1



TCG ACT ATG AAT GCT GAT ACT GCT CCA ACA 
ser thr met asn ala asp thr ala pro thr 

61 

TCA AGC TCC TGC TCC AGC TTC CAG GAC CAG 
ser ser ser cys ser ser phe gln asp gln



121 

AGA GTA CCC GCC AGC AGC ACT TCC TCA CCG 
arg val pro ala ser ser thr ser ser pro 

181 

CTG GAT GCC GAA GGG GAA GGA ATC AGC GAA 
leu asp ala glu gly glu gly ile ser glu 

241 

AGC CTG CTA AGT TCT CAC GAC CTG GAC CCA 
ser leu leu ser ser his asp leu asp pro 

301 

ATC GCC GCC CTT TAC CTA CCT TTA GTT GGC 
ile ala ala leu tyr leu pro leu val gly 

361 

GAC TTT ACA GTT GCA GAT ACT CGC AGA TAC 
asp phe thr val ala asp thr arg arg tyr 

421 

GGA GCC GGT GCC ATT ACC CAG AAT GTG GCT 
gly ala gly ala ile thr gln asn val ala

481 

AAA ACA AGT GGA ATA GTG CTG TCT TCC TTG 

lys thr ser gly ile val leu ser ser leu 

541

GAC ACT ACT CGC AAC CTC ATG ATC TGC TTC 
asp thr thr arg asn leu met ile cys phe 

601 

CTC ATT AGG AAG TGG ATT GCT GAC CTG CCA 
leu ile arg lys trp ile ala asp leu pro 



31 

TCT CCT TGT CCT TCC ATA TCT TCC CAG AAC 
ser pro cys pro ser ile ser ser gln asn

91 

AAG ATC GCC AGC ATG TTC GAT CGG ACT TCC 
lys ile ala ser met phe asp arg thr ser 

Cadherin 
151 | xx EC motif xx | 

GGG CTC CTC TTC ACA GAA CTG GCT GCT GCC 
gly leu leu phe thr glu leu ala ala ala 

211 

GTA CAA AGG AAA GCT GTC AGT GCA ATT CAC 
val gln arg lys ala val ser ala ile his

271

CGC TGT GTC AAA CCA GAG GTG AAG GTC AAA 
arg cys val lys pro glu val lys val lys

331 

ATC ATT TTG GAT GCT TTG CCA CAG CTC TGT 
ile ile leu asp ala leu pro gln leu cys

391 

CGC ACC AGT GGC TCG GAT GAA GAA CAA GAA 
arg thr ser gly ser asp glu glu gln glu

451 

CTG GCC ATA GCA GGG AAT AAT TTC AAT TTG 
leu ala ile ala gly asn asn phe asn leu 

511 

CCC TAT AAG CAG TAC AAC ATG CTG AAC GCG 
pro tyr lys gln tyr asn met leu asn ala

571 

CTC TGG ATC ATG AAA AAT GCT GAT CAG AGC 
leu trp ile met lys asn ala asp gln ser

631 

TCA ACG CAG CTC AAC AGG ATT TTA GAT CTA 
ser thr gln leu asn arg ile leu asp leu



661 691

CTT TTC ATC TGT GTG TTA TGT TTT GAG TAT AAG GGA AAA CAG AGT TCT GAC AAA GTC AGT 
leu phe ile cys val leu cys phe glu tyr lys gly lys gln ser ser asp lys val ser

721 751 

ACC CAA GTC CTG CAG AAG TCA AGG GAT GTC AAG GCC CGG CTG GAA GAG GCT TTG CTG CGT 
thr gln val leu gln lys ser arg asp val lys ala arg leu glu glu ala leu leu arg

781 811 

GGG GAA GGG GCC AGA GGG GAG ATG ATG CGC CGC CGG GCT CCA GGG AAC GAC CGA TTT CCA 
gly glu gly ala arg gly glu met met arg arg arg ala pro gly asn asp arg phe pro 

841 871 

GGC CTA AAT GAA AAT TTG AGA TGG AAG AAA GAG CAG ACA CAT TGG CGG CAA GCT AAT GAG 
gly leu asn glu asn leu arg trp lys lys glu gln thr his trp arg gln ala asn glu

901 931 

AAG CTA GAT AAA ACA AAG GCC GAG TTA GAT CAA GAA GCC TTG ATC AGT GGC AAT CTG GCT 
lys leu asp lys thr lys ala glu leu asp gln glu ala leu ile ser gly asn leu ala

961 991 

ACA GAA GCA CAT TTA ATC ATC CTG GAT ATG CAG GAA AAC ATT ATC CAG GCG AGC TCG GCT 
thr glu ala his leu ile ile leu asp met gln glu asn ile ile gln ala ser ser ala

1021 1051 

CTG GAC TGT AAA GAC AGC CTG CTG GGA GGT GTT CTG AGG GTG CTG GTG AAT TCT CTG AAC 
leu asp cys lys asp ser leu leu gly gly val leu arg val leu val asn ser leu asn 

1081 1111 

TGT GAT CAG AGT ACC ACC TAC CTG ACT CAC TGC TTT GCA ACA CTC CGT GCT CTC ATC GCC 
cys asp gln ser thr thr tyr leu thr his cys phe ala thr leu arg ala leu ile ala

1141 1171 

AAG TTT GGA GAC TTA CTC TTC GAA GAG GAG GTG GAA CAG TGT TTC GAC CTA TGT CAC CAA 
lys phe gly asp leu leu phe glu glu glu val glu gln cys phe asp leu cys his gln

1201 1231 

GTC CTG CAC CAC TGC AGC AGC AGC ATG GAT GTC ACC CGG AGC CAA GCC TGT GCC ACC CTT 
val leu his his cys ser ser ser met asp val thr arg ser gln ala cys ala thr leu

1261 1291 

TAC CTC CTC ATG AGG TTC AGT TTT GGA GCC ACC AGT AAT TTT GCA AGA GTA AAG ATG CAA 
tyr leu leu met arg phe ser phe gly ala thr ser asn phe ala arg val lys met gln



1321 1351 

GTA ACC ATG TCC CTG GCA TCT TTG GTG GGA AGA GCA CCA GAC TTT AAT GAA GAG CAC CTG 
val thr met ser leu ala ser leu val gly arg ala pro asp phe asn glu glu his leu 



1381 1411 

AGA AGA TCC TTG AGG ACA ATT TTG GCC TAT TCA GAA GAG GAC ACA GCC ATG CAG ATG ACT 

arg arg ser leu arg thr ile leu ala tyr ser glu glu asp thr ala met gln met thr

1441 1471 

CCT TTT CCC ACC CAG GTG GAG GAA CTT CTC TGT AAT CTG AAT AGC ATC TTA TAT GAC ACA 
pro phe pro thr gln val glu glu leu leu cys asn leu asn ser ile leu tyr asp thr

1501 1531 

GTG AAA ATG AGG GAA TTT CAG GAA GAT CCT GAG ATG CTT ATG GAT CTC ATG TAC AGA ATT 
val lys met arg glu phe gln glu asp pro glu met leu met asp leu met tyr arg ile

1561 1591 

GCC AAG AGT TAC CAG GCA TCT CCT GAT CTG CGG CTG ACC TGG CTC CAG AAC ATG GCA GAG 
ala lys ser tyr gln ala ser pro asp leu arg leu thr trp leu gln asn met ala glu

1621 | xxxxxxxxxxxxxxxxxxxxxx transmembrane domain xxxxxxx

AAA CAC ACC AAG AAG AAG TGC TAC ACG GAG GCT GCC ATG TGC CTG GTG CAC GCC GCT GCG 
lys his thr lys lys lys cys tyr thr glu ala ala met cys leu val his ala ala ala 

xxxxxxxxxxxxxxxxxxxxxxx | 1711 

TTA GTG GCT GAG TAT CTG AGC ATG CTG GAG GAC CAC AGC TAC CTG CCC GTG GGC AGT GTC 
leu val ala glu tyr leu ser met leu glu asp his ser tyr leu pro val gly ser val 

1741 1771 

AGC TTC CAG AAT ATT TCT TCC AAT GTG CTG GAG GAG TCT GTG GTC TCT GAG GAC ACC CTG 
ser phe gln asn ile ser ser asn val leu glu glu ser val val ser glu asp thr leu

1801 1831 

TCA CCT GAC GAG GAT GGG GTG TGC GCA GGC CAG TAC TTC ACC GAG AGT GGC CTG GTA GGC 
ser pro asp glu asp gly val cys ala gly gln tyr phe thr glu ser gly leu val gly

1861 1891 

CTC CTG GAG CAG GCC GCG GAG CTC TTC AGC ACG GGA GGC TTA TAT GAG ACA GTT AAT GAG 
leu leu glu gln ala ala glu leu phe ser thr gly gly leu tyr glu thr val asn glu

1921 1951 

GTC TAC AAG CTG GTC ATC CCC ATC CTA GAA GCG CAT CGA GAA TTC CGG AAG CTG ACA CTC 
val tyr lys leu val ile pro ile leu glu ala his arg glu phe arg lys leu thr leu 

1981 2011 

ACT CAC AGC AAG CTG CAG AGA GCC TTC GAC AGC ATC GTT AAC AAG GAT CAT AAG AGA ATG 
thr his ser lys leu gln arg ala phe asp ser ile val asn lys asp his lys arg met



2041 |xxxxx ITAM xxxx | 2071

TTT GGA ACC TAC TTC CGA GTT GGT TTC TTT GGA TCC AAA TTT GGG GAT TTG GAT GAA CAG 
phe gly thr tyr phe arg val gly phe phe gly ser lys phe gly asp leu asp glu gln



2101 2131 



GAG TTT GTC TAC AAA GAG CCT GCA ATT ACC AAG CTT CCT GAG ATC TCA CAT AGA CTA GAG 
glu phe val tyr lys glu pro ala ile thr lys leu pro glu ile ser his arg leu glu 



2161 2191 

GCA TTT TAT GGT CAA TGT TTT GGT GCA GAA TTT GTG GAA GTG ATT AAA GAC TCC ACT CCT 
ala phe tyr gly gln cys phe gly ala glu phe val glu val ile lys asp ser thr pro

2221 2251 

GTG GAC AAA ACC AAG TTG GAT CCT AAC AAG GCC TAC ATA CAG ATC ACT TTT GTG GAG CCC 
val asp lys thr lys leu asp pro asn lys ala tyr ile gln ile thr phe val glu pro

2281 2311 

TAC TTT GAT GAG TAT GAG ATG AAA GAC AGG GTC ACA TAC TTT GAG AAG AAT TTC AAC CTC 
tyr phe asp glu tyr glu met lys asp arg val thr tyr phe glu lys asn phe asn leu 

2341 2371 

CGG AGG TTC ATG TAC ACC ACC CCG TTC ACC CTG GAG GGG CGG CCT CGG GGA GAG CTG CAT 
arg arg phe met tyr thr thr pro phe thr leu glu gly arg pro arg gly glu leu his

2401 2431 

GAG CAG TAC AGA AGG AAC ACA GTC CTG ACC ACT ATG CAC GCC TTC CCC TAC ATC AAG ACC 
glu gln tyr arg arg asn thr val leu thr thr met his ala phe pro tyr ile lys thr

2491 | xxxxxxxxxxxxxxxxxxxxxxx 

AGG ATC AGC GTC ATC CAG AAG GAG GAG TTT GTT TTG ACA CCG ATT GAA GTT GCC ATT GAA 
arg ile ser val ile gln lys glu glu phe val leu thr pro ile glu val ala ile glu

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Coiled coil 1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
GAC ATG AAG AAG AAG ACC CTG CAG TTA GCA GTT GCC ATT AAC CAG GAG CCG CCT GAT GCA 
asp met lys lys lys thr leu gln leu ala val ala ile asn gln glu pro pro asp ala

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | 2611

AAG ATG CTT CAG ATG GTG CTG CAA GGC TCT GTG GGA GCT ACT GTA AAT CAG GGA CCA CTG 
lys met leu gln met val leu gln gly ser val gly ala thr val asn gln gly pro leu

2641 2671 

GAA GTA GCC CAA GTG TTT TTG GCT GAA ATT CCT GCT GAT CCA AAA CTC TAT CGA CAT CAC 
glu val ala gln val phe leu ala glu ile pro ala asp pro lys leu tyr arg his his

2701 2731 | xxxxxxxxxxx 

AAC AAG TTG AGG TTA TGC TTT AAG GAA TTC ATC ATG AGA TGT GGT GAA GCT GTA GAG AAA 
asn lys leu arg leu cys phe lys glu phe ile met arg cys gly glu ala val glu lys

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Coiled coil 2 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
AAC AAG CGT CTC ATC ACG GCA GAC CAG AGG GAA TAT CAG CAG GAA CTC AAA AAG AAC TAT 
asn lys arg leu ile thr ala asp gln arg glu tyr gln gln glu leu lys lys asn tyr



xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |

AAC AAG CTA AAA GAG AAC CTC AGG CCA ATG ATC GAG CGG AAA ATT CCA GAA CTG TAC AAG 
asn lys leu lys glu asn leu arg pro met ile glu arg lys ile pro glu leu tyr lys 

2881 2911 

CCA ATA TTC AGA GTT GAG AGT CAA AAG AGG GAC TCC TTC CAC AGA TCT AGT TTC AGG AAA 
pro ile phe arg val glu ser gln lys arg asp ser phe his arg ser ser phe arg lys

2941 2971 

TGT GAA ACC CAG TTG TCA CAG GGC AGC TAA GAA AAG CCA TCT TCA TTC GTG GAG ACT GTG 
cys glu thr gln leu ser gln gly ser OCH glu lys pro ser ser phe val glu thr val

3001 3031 

GCC CTG CAA CCC TGG AGA AGG ACT TGC TGG TAC TTA AAA AAT GGG ACA TTT GCC ACC CAG 
ala leu gln pro trp arg arg thr cys trp tyr leu lys asn gly thr phe ala thr gln

3061 3091 

GAC TGA CTG TAC ACT CCC TGA TCA GCC AGC ACT CTG GAA GCT TTG GGA TCC CAG GAA CCA 
asp STP 

3121 3151 

TGG AAT TAT TCC CAA ATG GAC TCT GAC CAG ATT TTT GCC ATA CTG GGG GGT GGC GGG ATG 
3181 3211 

GAG GAT GGG TAC TCA GGC ATG ACT GCG TAT TTA TTA AAG TGT GTT TTT CCA CAA TGT ACC 
3241 3271

AAA CAA GGC ATA AGC AGC TTC TCC TGC TGA CTG GCC AAT CAC TGC CCA TCT GAG AGA TGA 
3301 3331 

TTT CCT CTG GCC CAT ATT TGA ATT TAT TGG AGT AAC TCA AAT TGC CTG AGG AAA AAT GGA 
3361 3391 



AAA ATT ATC CAC 



CAG TCG ATT CAA ACT GAA TTT CAC TCT TTA TAG GAA GGC AGG GCA AAC 



3421 3451

TTG TAG GAG TAC GAA ACA TTT TCA ATA AAT CTA CAA AGG GAA GCC TTA CTA CAA TTC CAA 

3481 3511 

AAA TCA TCA TGG TTG GAA ATT TGG GAG GAG ATT ATT TGT GAA CTT GTT ACC CTT TTG GTA 

3541 3571 

ATG GTG GAC TAA TTG CTG TAT AGT TAT TTT TGT TTT ATT ATT ACT GTT ACA TTA ATT TAA 

3601 3631 

CAT GCA TTT ATA GAA GAA TAC ATT CAA AGC ACT GAT GTA GGA GAT ACA CGG TAC TTG GAG 



3661 



3691 



3721 3751 

CTT TGC TTT TTT TCT TAT GTC ACT CTT GTG TAC TAT CTA TTT TTC TCC TCT CTG GGA CCA 
3781 3811 

AGT TTC TTT TTA TAA AGC AAT AAT ATC TCT GTT TTC ATT TCA GAA CAT TGT GCT GTC TGT 
3841 3871 

CAG CAT ATG TAT ATC AGC TAC AAA ATA TAT TCA ACT TTG ACT TCT TTT GAC AAA GGA CTT 
3901 3931 

TAG GAA AAG GAG GAA CAA AGA CAT TAT TTG AGA ATT AAA TTA TAT ATT TTT AAT ATG ACT 
3961 3991

GTG ACC TTG ACT GAT AAT AAA GAT GTA ATA AGA ATT GCA AGC TAA AAA AAA AAA AAA AAA 



4021 
B



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A

KIAA 

rat 

HC4

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



ASGNLDKNARFSAI YRQDSNKLSNDDMLKLI^FRKPEKMAKLPVI LGNLDI TI DNVS SD 

FPNYVN S £ Y I P T K Q FE T C SK T P I T FE VEE FVPC I PKH T QP Y T I Y TNHL YVY PK YLK YDS Q 

VXHHHQNPE FYDE I K 

KS FAKARNIA I CI EFKDS DEE DSQPLKC I YGRPGG PVFTRSAFAAVLHHHQNPEFYDE I K 

IELPTCLHEKHHLLLTFFHVSCDNSSKGSTKKRDWETQVGYSWLPLLKDGRWTSEQHI 
IELPTQI2^KJiHLLLTFF7WSCDNSS 

PVSANLPSGYLGYQELGMGRHYGPE IKWVDGGKPLLKI STHLVS TVYTQDQHLHNFFQYC 
PVSANLPSGYLGYQELGMGRHYGPEIKWVDGGKPLLKISTHLVSTVYTQDQHLHNFFQYC

GPGPARS TVS IS LI SNSARV 

QKTESGAQALGNELVKYLKSL14AMEGHVMI AFLPT I LNQLFRVLT-RATQEEVAVTJVTRV 
QK TE S GAQAL GNELVK YLKSLHAME GHVM I AFLP T I QL FRVL T-RATQEEVAVNVTRV 

1 QVL i RFLSVI LMQLFWVLPNM IHEDDVP I SCPMV 

MS PLP 1 1 LNQLFKVLV-QNEEDE ITTTVTRV 

NRSRSLSNSNPDI SGTPTSPDDEVRS 1 1 GSKGLDRSNSWVNTGGPKAAPWGSNPSPSAES 



A 



HC2A 
KIAA 
rat 
HC4 

HC1 
HC3 
HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4

HC1

HC3 

HC5 



I I HWAQi 
IIHWAQC 



chKg 



LESHLRSYVKYAYKAEPYVASEYKTVHEELT 
GLE S HLRS YVKYA YKAE P YVASE YK T VHEEL T. 



I LKPSADFLTSN 
TT I LKPSADFLTSN 



LFHIVSKCHEEGLDSYLSSFIKYSFRPGKPSAPQAPLIHETLATMMIALLKQSADFLAIN 

LPDI VAKCHEEQLDHSVQS Y IKFVFKTR ACKERPVHEDLAKNVTGLLK-SNDSPTVK 

TQAMDRSCNKMSSHTETSS FLQTLTGRLP TKKLFHEELALQWWCSG- -SVR E 



Cadherin 



KLLRYSWFFFDVL IKSMAQHL IENSKVKLL 
K L LKY S W F FFDVL I KSMAQH L I E NS KVKL L 


RNQP 
RNQP 


FPA5 YHHAAE TVVNMLMPH I TQK FG D 
FPASYHHAVETWNMLMPH ITQKFRD 


KLLKYSWFFFEI IAKSMATYLLEENKIKLT 
HVLKHSWFFFAI ILKSMAQHLIDTNKIQLE 
SALQQAW FFFELMV>;SMVHHLY FNDKLEAP 


HGQF 
RPQP 
RKSP 


FPKAYHHALHSLFLAI T - 1 VESQYAE 
FP E S Y QNE L DNL VMVL S DHV I WKY KD 
FPERFMDDIAALVSTIASDIVSRFQK 



NPEASKNANHSLAVFIKRCFTFMDRGFVFKQIN NYIS — CFAPGDPKTLFEYKFEFL 

NPEASKNANHSLAVFIKRCFTFMDRGFVFKQIN NYIS — CFAPGDPKTLFEYKFEFL 



I PKESRNVNYSLAS FLKCCLTLMDRGFVFNLIN DYIS — GFSPKDPKVLAEYKFEFL 

ALEETRRATHSVARFLKRCFTFMDRGCVFKMVN NYIS — MFSSGDLKTLCQYKFDFL 

liiTEMVERLNTSLAFFLNDLLSVMDRGFVFSLIKSCYKQVSSKLYSLPNPSVLVSLRLDFL 



PWCNHEHYI PLNLPM PFGKGRIQR YQDLQL DYSLTDEF 

RWCNHEHYJPLNLPM PFGKGRIQR YQDLQL DYSLTDEF 



QT I CNHEHYI PLNLPM AFAKPKLQR VQDSNL EYSLSDEY 

QEVCQHEHFI PLCLPI RSANI PDPLTPSES TQELHASDMPEYSVTNE F 

F.IICSHEHYVTLNLPCSLLTPPASPSPSVSSATSQSSGFSTNVQDQKIANMFELS--VPF 
MNADTAPTSPCPS I S SQNSSSCSS FQDQKIASMFDRTSRVPA 

Cadherin 



EC motif 



CRNHFLVGL 
CRNHFLVGL 


LLRE 
LLRE 


VGTALQEFRE VF.LIAISVLKNLLI KHS FDDRYASRSHQARIAT 

VGTALQEFRE VRL IAISVLKNLL I KHS FDDRYASRSHQARIAT 


CKHHFLVGL 
CRKHFLIG I 
RQQHYLAGL 
SSTS-SPGL 


LLRE 
LLRE 
VLTE 
LFTFj 


TS I ALQDNYE I RYTAI S VI KNLLI KHAFTJTRYQHKNQQAKI AQ 

VG FALQE DQD VRHLALAVLKNLMAKHS FDDRYRE PRKQAQ I AS 

LAV I LD P D A£ G L FG L HKKV I NMVHNL LS S H DS D P R Y S D PQ I KARVAM 
LAAALDAEGEG I SEVQRKAVSAI HS LLS SHDLDPRCVKPEVKVK I AA 



LYLPLFGLL I ENVQF. INVF DVS PFPVNAG-MTVKE'ESLALPAVNPLVTPQKGSTLDNSLH 
LYLPLFGLLIENVQRIN\T;DVSPFP\^G-MTVKI)ESLALPAVNPLVTPQKGSTLDNSLH 



LYLPFVGLLLEN IQRLAGRDTLYSCAAMPNSASRDEFPCG FTSP- -AN- -RGSLS 

LYMPLYGMLLDNMPRI YLKDLYPFTVNTSNQGSRDDLSTNGGFQSQTAIKHANSVDTSFS 

LYLPLIGI IMETVPQLYDFTETHNQRGRPI CIATDDYESE SG SMIS 

LYLPLVGI ILDALPQLCDFTVADTRRYR TSGSDEEQE GA GAIT 



A 



HC2A

KIAA 

rat 

HC4 

HC1 

HC3 

HC5



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5



HC2A

KIAA 

rat 

HC4 

HC1 

HC3 

HC5



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5



HC2A 
KIAA 
rat 
HC4 



KDLLGAISC^^PYTTSTPNINSVRNADSRGSLISTDSGNSLPE^^EKSNSLDKHC?QSS 
KDLL GA 1 P Y T T S T PN I N S VRNADS R GS L I S T DS GN S LP^^»<SNS LDKH QQS S 



TDKDTAYGSFQNG HG I KREDSRGSLI P-EGATGFPDQGNTGEN TRQS 

KDVLNS I AAFS S IAISTVNHADSRASLASLDSNPSTNEKSSEKTDNCEKIPRPL 

QTVAMAI AGTSVPQ LTRPGSFLLTSTSGRQHT 

QNVALAIAGNNFN LKTSG- IVLSSLPYKQYN 



T L GNS WR C DK L DQ S E I K S L LMC FL Y I LKSMS DDAL FT YWN - KAS T S E LMD F FT I S E VC L 
T LGNSWRCDKLDQSE I KS LLMC FLY I LKSMS DDALFT YWN- KASTSELMDFFTI SEVCL 



STRSSVSQYNRLDQYEIRSLl^CYLYIVKMISEDTLLTYWN-KVSPQELINILILLEVCL 
AL I GSTLR FDRLDQAE TPS LLMC FLHIMKT I S YE TL I AYWQ- RAPS PEVSDFFS I LDVCL 

TFSAESSRSLLICLLWVLf^-ADETVLQKWFTDLSVLQLNRLLDLLYLCV 

MLNADT TPNLM I C FLW IMKN-ADQSLI RKW I ADLPS TQLNR I LDLL F I CV 



HQFQYMGKRY IARNQEGLG — PI VHDRKS QTLPVSRNRTGMM 

HQFQYMGKRYIAR TGMM 



FHFRYMGKRNIARVHDAWLSKHFGIDRKS QTMPALRNRSGVM 

QNFRYLGKRNI I RKIAAAF — KFVQSTQNNGTLKGSNPSCQTSGLLAQWMHSTSRHEGHK 

SC FEYKGKKVFERMNSLTFK — KSKDMRAK LEEAI LGS I GARQEMV 

LCFEYKGKQSSDKVSTQVLQ — KSR-DVKAR LEEALLRGEGARGEMM 



HARLQQL GSLDNS LTFNHSYGHSDADVLHQSLLEANIATEVC 

HARLQQL GSLDNS LTFNHSYGHSDADVLHQSLLEANIATEVC 



QARLQHL SSLESS FTLNHSSTTTEADI FHQALLEGNTATEVS 

QHRSQTLPI IRGK NALSNPKL LQMLDNTMTSNSNE I DI VHHVDTEAN I ATEGC 

RRSRGQLERSPSGSAFGSQENLRWRKDMTHWRQNTEKLDKSRAEIEHEALI DGNLATEAN 
RRRAPGNDRFP GLNENLRWKKEQTHWRQANEKLDKTKAELDQEAL I SGNLATEAH 



LTALDTLSLFTI^FKNQLIAI)HGHNPLMKK 
LTALDTLSLFTLAPK>JQLLAI^HGHNPLMKK^ro 

KLSRGHS PLMKKVFDVYLC FLQKHQSEMALKNVFTALRS L I Y 

LTVLDTISFFTQCFKTHFL^DGHNPI>IKK^FDIHLAFLKNGQSEVSLKHVFASLRAFIS 
LTILDLVSLFTQTHQRQLQQCDCQNS1>IKKGFDTYMLFFQVNQSATALKHVFASLRLFVC 
L 1 1 LDTLE I WC'TVS - - VTES - -KES I LGGVLKVLLHSMACNQSAVYLQHCFATQRALVS 
LI ILDMQENI IQASS — ALDC — KDSLLGGVLRVLVNSLNCDQSTTYLTHCFATLRALIA 



K F P S T FTE GRADMCAAL C Y E I L KC CNS KLS S I R TEAS QLL Y FLMRNN FD Y T GKKS FVRT H 
P^FPSTFYEGPJVDMCAALCYEILKCCNSKLSSIRTEASQI^ 

K F P S T FYE G RADMCAS L C YE VLKC CNS KLS S I RTEAS QLL Y FLMRNN FD Y T GKKSFVRTH 
V. F P S A F FK GR VNM C AA FC YE VLK C C T S K I S S T RNEAS ALL Y LLMRNN FE Y T KRKT FLR T H 
FFPSAFFQGPADLCGS FCYEVLKCCNHRSRSTQTEASALLYLFMRKNFEFNKQKS I VRSH 
KFPELLFEEETEQCAr , LCLRLLRHCSSSIGTIRSHPSASLYLLMRQNFEIGN--NFARVK 
KFGDLLFEEEVEQCFDLCHQVLHHCSSSMDVTRSQACATLYLLMRFS FGATS — NFARVK 
| ref 1.1

1 31 
TCG ACT ATG AAT GCT GAT ACT GCT CCA ACA TCT CCT TGT CCT TCC ATA TCT TCC CAG AAC
ser thr met asn ala asp thr ala pro thr ser pro cys pro ser ile ser ser gln asn



61 91 

TCA AGC TCC TGC TCC AGC TTC CAG GAC CAG AAG ATC GCC AGC ATG TTC GAT CGG ACT TCC 
ser ser ser cys ser ser phe gln asp gln lys ile ala ser met phe asp arg thr ser

121 151 

AGA GTA CCC GCC AGC AGC ACT TCC TCA CCG GGG CTC CTC TTC ACA GAA CTG GCT GCT GCC 
arg val pro ala ser ser thr ser ser pro gly leu leu phe thr glu leu ala ala ala 

181 211 

CTG GAT GCC GAA GGG GAA GGA ATC AGC GAA GTA CAA AGG AAA GCT GTC AGT GCA ATT CAC 
leu asp ala glu gly glu gly ile ser glu val gln arg lys ala val ser ala ile his

241 271 

AGC CTG CTA AGT TCT CAC GAC CTG GAC CCA CGC TGT GTC AAA CCA GAG GTG AAG GTC AAA 
ser leu leu ser ser his asp leu asp pro arg cys val lys pro glu val lys val lys 

301 331 

ATC GCC GCC CTT TAC CTA CCT TTA GTT GGC ATC ATT TTG GAT GCT TTG CCA CAG CTC TGT 
ile ala ala leu tyr leu pro leu val gly ile ile leu asp ala leu pro gln leu cys

361 391 

GAC TTT ACA GTT GCA GAT ACT CGC AGA TAC CGC ACC AGT GGC TCG GAT GAA GAA CAA GAA
asp phe thr val ala asp thr arg arg tyr arg thr ser gly ser asp glu glu gln glu

421 451 

GGA GCC GGT GCC ATT ACC CAG AAT GTG GCT CTG GCC ATA GCA GGG AAT AAT TTC AAT TTG 

gly ala gly ala ile thr gln asn val ala leu ala ile ala gly asn asn phe asn leu

| ref 2.1

481 ♦ 511

AAA ACA AGT GGA ATA GTG CTG TCT TCC TTG CCC TAT AAG CAG TAC AAC ATG CTG AAC GCG 

lys thr ser gly ile val leu ser ser leu pro tyr lys gln tyr asn met leu asn ala

541 571 

GAC ACT ACT CGC AAC CTC ATG ATC TGC TTC CTC TGG ATC ATG AAA AAT GCT GAT CAG AGC 
asp thr thr arg asn leu met ile cys phe leu trp ile met lys asn ala asp gln ser

601 631 

CTC ATT AGG AAG TGG ATT GCT GAC CTG CCA TCA ACG CAG CTC AAC AGG ATT TTA GAT CTA 
leu ile arg lys trp ile ala asp leu pro ser thr gln leu asn arg ile leu asp leu

661 691 

CTT TTC ATC TGT GTG TTA TGT TTT GAG TAT AAG GGA AAA CAG AGT TCT GAC AAA GTC AGT 
leu phe ile cys val leu cys phe glu tyr lys gly lys gln ser ser asp lys val ser

721 751 

ACC CAA GTC CTG CAG AAG TCA AGG GAT GTC AAG GCC CGG CTG GAA GAG GCT TTG CTG CGT 
thr gln val leu gln lys ser arg asp val lys ala arg leu glu glu ala leu leu arg



A 



841 

GGC CTA AAT GAA AAT TTG AGA TGG AAG AAA 
gly leu asn glu asn leu arg trp lys lys 

901 

AAG CTA GAT AAA ACA AAG GCC GAG TTA GAT 
lys leu asp lys thr lys ala glu leu asp 

961 

ACA GAA GCA CAT TTA ATC ATC CTG GAT ATG 
thr glu ala his leu ile ile leu asp met 

1021 

CTG GAC TGT AAA GAC AGC CTG CTG GGA GGT 
leu asp cys lys asp ser leu leu gly gly 

1081

TGT GAT CAG AGT ACC ACC TAC CTG ACT CAC 
cys asp gln ser thr thr tyr leu thr his

1141

AAG TTT GGA GAC TTA CTC TTC GAA GAG GAG 
lys phe gly asp leu leu phe glu glu glu 

1201 

GTC CTG CAC CAC TGC AGC AGC AGC ATG GAT 
val leu his his cys ser ser ser met asp 

1261 

TAC CTC CTC ATG AGG TTC AGT TTT GGA GCC 
tyr leu leu met arg phe ser phe gly ala 

1321 

GTA ACC ATG TCC CTG GCA TCT TTG GTG GGA 
val thr met ser leu ala ser leu val gly

1381 

AGA AGA TCC TTG AGG ACA ATT TTG GCC TAT 
arg arg ser leu arg thr ile leu ala tyr 

1441 

CCT TTT CCC ACC CAG GTG GAG GAA CTT CTC 
pro phe pro thr gln val glu glu leu leu

1501 

GTG AAA ATG AGG GAA TTT CAG GAA GAT CCT
val lys met arg glu phe gln glu asp pro

1561 

GCC AAG AGT TAC CAG GCA TCT CCT GAT CTG 
ala lys ser tyr gln ala ser pro asp leu

1621 

AAA CAC ACC AAG AAG AAG TGC TAC ACG GAG




871 

GAG CAG ACA CAT TGG CGG CAA GCT AAT GAG 
glu gln thr his trp arg gln ala asn glu

931 

CAA GAA GCC TTG ATC AGT GGC AAT CTG GCT 
gln glu ala leu ile ser gly asn leu ala

991 

CAG GAA AAC ATT ATC CAG GCG AGC TCG GCT 
gln glu asn ile ile gln ala ser ser ala

1051 

GTT CTG AGG GTG CTG GTG AAT TCT CTG AAC 
val leu arg val leu val asn ser leu asn

| ref 3.1

TGC TTT GCA ACA CTC CGT GCT CTC ATC GCC 
cys phe ala thr leu arg ala leu ile ala 

1171

GTG GAA CAG TGT TTC GAC CTA TGT CAC CAA 
val glu gln cys phe asp leu cys his gln

1231 

GTC ACC CGG AGC CAA GCC TGT GCC ACC CTT 
val thr arg ser gln ala cys ala thr leu

1291 

ACC AGT AAT TTT GCA AGA GTA AAG ATG CAA 
thr ser asn phe ala arg val lys met gln

1351 

AGA GCA CCA GAC TTT AAT GAA GAG CAC CTG 
arg ala pro asp phe asn glu glu his leu 

1411 

TCA GAA GAG GAC ACA GCC ATG CAG ATG ACT 
ser glu glu asp thr ala met gln met thr

1471 

TGT AAT CTG AAT AGC ATC TTA TAT GAC ACA 

cys asn leu asn ser ile leu tyr asp thr 

1531 

GAG ATG CTT ATG GAT CTC ATG TAC AGA ATT 
glu met leu met asp leu met tyr arg ile 

1591 

CGG CTG ACC TGG CTC CAG AAC ATG GCA GAG 
arg leu thr trp leu gln asn met ala glu

1651 

GCT GCC ATG TGC CTG GTG CAC GCC GCT GCG 



A 



1681 

TTA GTG GCT GAG TAT CTG AGC ATG CTG GAG 

leu val ala glu tyr leu ser met leu glu 

1741 

AGC TTC CAG AAT ATT TCT TCC AAT GTG CTG 
ser phe gln asn ile ser ser asn val leu

1801 

TCA CCT GAC GAG GAT GGG GTG TGC GCA GGC 
ser pro asp glu asp gly val cys ala gly 

1861 

CTC CTG GAG CAG GCC GCG GAG CTC TTC AGC 
leu leu glu gln ala ala glu leu phe ser

1921 

GTC TAC AAG CTG GTC ATC CCC ATC CTA GAA 
val tyr lys leu val ile pro ile leu glu 

1981 

ACT CAC AGC AAG CTG CAG AGA GCC TTC GAC 
thr his ser lys leu gln arg ala phe asp

2041

TTT GGA ACC TAC TTC CGA GTT GGT TTC TTT 
phe gly thr tyr phe arg val gly phe phe 

2101

GAG TTT GTC TAC AAA GAG CCT GCA ATT ACC 
glu phe val tyr lys glu pro ala ile thr 

GCA TTT TAT GGT CAA TGT TTT GGT GCA GAA 
ala phe tyr gly gln cys phe gly ala glu

| ref 4.1
2221 ▼

GTG GAC AAA ACC AAG TTG GAT CCT AAC AAG 
val asp lys thr lys leu asp pro asn lys 

2281 

TAC TTT GAT GAG TAT GAG ATG AAA GAC AGG 
tyr phe asp glu tyr glu met lys asp arg 

2341

CGG AGG TTC ATG TAC ACC ACC CCG TTC ACC 
arg arg phe met tyr thr thr pro phe thr 

2401 

GAG CAG TAC AGA AGG AAC ACA GTC CTG ACC 
glu gln tyr arg arg asn thr val leu thr

2461 

AGG ATC AGC GTC ATC CAG AAG GAG GAG TTT 
arg ile ser val ile gln lys glu glu phe




1711 

GAC CAC AGC TAC CTG CCC GTG GGC AGT GTC 

asp his ser tyr leu pro val gly ser val 

1771 

GAG GAG TCT GTG GTC TCT GAG GAC ACC CTG 
glu glu ser val val ser glu asp thr leu 

1831 

CAG TAC TTC ACC GAG AGT GGC CTG GTA GGC 
gln tyr phe thr glu ser gly leu val gly

1891 

ACG GGA GGC TTA TAT GAG ACA GTT AAT GAG 
thr gly gly leu tyr glu thr val asn glu 

1951 

GCG CAT CGA GAA TTC CGG AAG CTG ACA CTC 
ala his arg glu phe arg lys leu thr leu 

2011 

AGC ATC GTT AAC AAG GAT CAT AAG AGA ATG 
ser ile val asn lys asp his lys arg met 

2071 

GGA TCC AAA TTT GGG GAT TTG GAT GAA CAG 
gly ser lys phe gly asp leu asp glu gln

2131 

AAG CTT CCT GAG ATC TCA CAT AGA CTA GAG 
lys leu pro glu ile ser his arg leu glu 

2191 

TTT GTG GAA GTG ATT AAA GAC TCC ACT CCT 
phe val glu val ile lys asp ser thr pro 

2251 

GCC TAC ATA CAG ATC ACT TTT GTG GAG CCC 
ala tyr ile gln ile thr phe val glu pro

2311 

GTC ACA TAC TTT GAG AAG AAT TTC AAC CTC 
val thr tyr phe glu lys asn phe asn leu 

2371

CTG GAG GGG CGG CCT CGG GGA GAG CTG CAT 
leu glu gly arg pro arg gly glu leu his 

2431

ACT ATG CAC GCC TTC CCC TAC ATC AAG ACC 
thr met his ala phe pro tyr ile lys thr 

2491 

GTT TTG ACA CCG ATT GAA GTT GCC ATT GAA 
val leu thr pro ile glu val ala ile glu 






GAC ATG AAG AAG AAG ACC CTG CAG TTA GCA 

asp met lys lys lys thr leu gln leu ala

2581 

AAG ATG CTT CAG ATG GTG CTG CAA GGC TCT 
lys met leu gln met val leu gln gly ser

2641 

GAA GTA GCC CAA GTG TTT TTG GCT GAA ATT 
glu val ala gln val phe leu ala glu ile

2701 

AAC AAG TTG AGG TTA TGC TTT AAG GAA TTC 
asn lys leu arg leu cys phe lys glu phe 



2761 

AAC AAG CGT CTC ATC ACG GCA GAC CAG AGG 
asn lys arg leu ile thr ala asp gln arg



2821 

AAC AAG CTA AAA GAG AAC CTC AGG CCA ATG 
asn lys leu lys glu asn leu arg pro met 



2881

CCA ATA TTC AGA GTT GAG AGT CAA AAG AGG 
pro ile phe arg val glu ser gln lys arg



2941 

TGT GAA ACC CAG TTG TCA CAG GGC AGC TAA 
cys glu thr gln leu ser gln gly ser OCH
| ref 5.1

GCC CTG CAA CCC TGG AGA AGG ACT TGC TGG



3061

GAC TGA CTG TAC ACT CCC TGA TCA GCC AGC 



3121 

TGG AAT TAT TCC CAA ATG GAC TCT GAC CAG 



3181 

GAG GAT GGG TAC TCA GGC ATG ACT GCG TAT 



3241

AAA CAA GGC ATA AGC AGC TTC TCC TGC TGA 



3301 

TTT CCT CTG GCC CAT ATT TGA ATT TAT TGG 



3361 

AAA ATT ATC CAC CAG TCG ATT CAA ACT GAA 



3421 

TTG TAG GAG TAC GAA ACA TTT TCA ATA AAT 




2551 

GTT GCC ATT AAC CAG GAG CCG CCT GAT GCA 

val ala ile asn gln glu pro pro asp ala

2611 

GTG GGA GCT ACT GTA AAT CAG GGA CCA CTG 
val gly ala thr val asn gln gly pro leu

2671 

CCT GCT GAT CCA AAA CTC TAT CGA CAT CAC 
pro ala asp pro lys leu tyr arg his his 

2731 

ATC ATG AGA TGT GGT GAA GCT GTA GAG AAA 
ile met arg cys gly glu ala val glu lys 

2791 

GAA TAT CAG CAG GAA CTC AAA AAG AAC TAT 
glu tyr gln gln glu leu lys lys asn tyr

2851 

ATC GAG CGG AAA ATT CCA GAA CTG TAC AAG 
ile glu arg lys ile pro glu leu tyr lys

2911 

GAC TCC TTC CAC AGA TCT AGT TTC AGG AAA 
asp ser phe his arg ser ser phe arg lys 

2971 

GAA AAG CCA TCT TCA TTC GTG GAG ACT GTG 



3031 

TAC TTA AAA AAT GGG ACA TTT GCC ACC CAG 
3091 

ACT CTG GAA GCT TTG GGA TCC CAG GAA CCA 
3151 

ATT TTT GCC ATA CTG GGG GGT GGC GGG ATG 



3211 

TTA TTA AAG TGT GTT TTT CCA CAA TGT ACC 



3271

CTG GCC AAT CAC TGC CCA TCT GAG AGA TGA 



3331 

AGT AAC TCA AAT TGC CTG AGG AAA AAT GGA 



3391 

TTT CAC TCT TTA TAG GAA GGC AGG GCA AAC 



3451 

CTA CAA AGG GAA GCC TTA CTA CAA TTC CAA 



A 



AAA TCA TCA TGG TTG GAA ATT TGG GAG GAG 
3541 

ATG GTG GAC TAA TTG CTG TAT AGT TAT TTT 
3601 

CAT GCA TTT ATA GAA GAA TAC ATT CAA AGC 
3661 

CAG TCA GCC AAA AAT CAC AGA TAC TGC TTT 
3721 

CTT TGC TTT TTT TCT TAT GTC ACT CTT GTG 
3781 

AGT TTC TTT TTA TAA AGC AAT AAT ATC TCT 
3841 

CAG CAT ATG TAT ATC AGC TAC AAA ATA TAT 
3901 

TAG GAA AAG GAG GAA CAA AGA CAT TAT TTG 
3961 

GTG ACC TTG ACT GAT AAT AAA GAT GTA ATA 

4021 
AAC TCG 




ATT ATT TGT GAA CTT GTT ACC CTT TTG GTA 

3571 

TGT TTT ATT ATT ACT GTT ACA TTA ATT TAA 
3631 

ACT GAT GTA GGA GAT ACA CGG TAC TTG GAG 
3691 

CAC TTA AAT GGA AAC AAT TCT CCG ATA ATG 
3751 

TAC TAT CTA TTT TTC TCC TCT CTG GGA CCA 
3811 

GTT TTC ATT TCA GAA CAT TGT GCT GTC TGT 
3871 

TCA ACT TTG ACT TCT TTT GAC AAA GGA CTT 
3931 

AGA ATT AAA TTA TAT ATT TTT AAT ATG ACT
3991 

AGA ATT GCA AGC TAA AAA AAA AAA AAA AAA 
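
The three-letter translation printed beneath each nucleotide row follows the standard genetic code. The sketch below reproduces the first row under that assumption. The function name `translate_codons` and the uniform `STP` label for every stop codon are illustrative choices, not taken from the figure, which labels the TAA stop as OCH.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes; "STP" for stops is an assumption of this sketch.
THREE_LETTER = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
    "*": "STP",
}


def translate_codons(row: str) -> str:
    """Translate a whitespace-separated row of codons into three-letter codes."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.upper().split())


if __name__ == "__main__":
    # First ten codons of the cDNA as printed in the figure.
    print(translate_codons("TCG ACT ATG AAT GCT GAT ACT GCT CCA ACA"))
```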



References 

BAC sequences of Human CLASP 5 
Ref 1.1 

Sequence of BAC19 using primer HC5S11, which spans nucleotides 3-22 of the cDNA. Exon
sequence is underlined and represents nucleotides 32-57.
CTCTCTGT CTTCATATCTTCCAG GTT AT AA AGNATE 
TATTTCATTTAACTAGCTCAGTTTAATCATC 

TTGACAAAACAATCAAACAATTCAAACCAGATCAAGTATGCTACCCTGAAGTTACACC 

ACTAGCTAAGAATTAACAATCTAAGTAATTGGTTTCTCCCCAGGCTCAAGGCTCCCTGA 

TCAGGTTAAGTAAAGCCAAGAATCCAATAAGCCCTATGAAATTTAGAAACTCATAGAA 

AAGTCrCAAATCTTCTTGTCTGACATTAGCCAATTGTTATATTATGCAAATAGAGGATT 

NCAAGTAAATAAGTTTGGAACCTGTTTACCAGGTTTTTGCAGCAGNCCTCTAAGAGCTT 

AACTGGTCATGCATTGAATGCCGAGAGCAAAGAGGAATGGAGAGGGGNTGTAAGNGG 

TTCCAATNTTACTGGAACCCACCACT 

GNCT(mTAGGCCTOTANTAANTAGAATCTATATGGATTCGTGTTCTGTCNGCAAGNAG 
TGCCTATGAAA 
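
The nucleotide ranges quoted in these references are 1-based and inclusive with respect to the cDNA numbering. As a minimal sketch, assuming that numbering, the primer site (3-22) and exon segment (32-57) of Ref 1.1 can be pulled from the first 60 nucleotides of the cDNA. The helper name `cdna_range` is hypothetical.

```python
def cdna_range(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end of seq, 1-based and inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]


# First 60 nucleotides of the cDNA, copied from the translated sequence rows.
CDNA_1_60 = (
    "TCG ACT ATG AAT GCT GAT ACT GCT CCA ACA "
    "TCT CCT TGT CCT TCC ATA TCT TCC CAG AAC"
).replace(" ", "")

primer_site = cdna_range(CDNA_1_60, 3, 22)   # region spanned by the primer
exon_part = cdna_range(CDNA_1_60, 32, 57)    # exon segment seen in the BAC read

if __name__ == "__main__":
    print(primer_site)
    print(exon_part)
```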




Ref 2.1

Sequence of BAC19 using primer HC5AS10b, which spans nucleotides 560-580 of the cDNA.
Exon sequence is underlined and represents nucleotides 510-553.

TnCGAGTAGTGTCCGCGTTCAGCATGTTGTACTGCTTATAGGGCT GAAGGGAGGCACG 
ATTGGGGGATGGAGGCCAGGGAAGAAGTCAAGCAACAGAAAAATTTTGAGGCTTAACA 
GTCAAGCAACAGAAAAATTCAAAGTGTTCTCTTAAAATACCATGACTGTACATCACTG 
CTAGGCTGGAGATCTATTGCCAGTAGCCCTGCCTTCCCTAGGCAGGGGAAGCTGTGTT 
CTTTGAGTAGCGCTACTCAGCAAAGAGGCTCACCTGGGGCAGTATTTGAGCTAGGCTT 
TCAGCCACCGTATCTGAGTACCTCTGTCTTANGAGCAGTGTGGCCTGGTGATCACCCCT 
GGGCCTTGATCATGCGTGCTGCAATCCCAGTGATACAAAGAGGCTTTCATGCTGCTAA 
GATCTCCAAGTATTTCTCCTTCGTGCTGGGCAGCAGAGGGTTAGACTTNCAGGGGAGA 
AGGAAACTGGCTGGGTGCCATGAATAANCTTGCTGTTCAAGAmTAACTTCTTTGTTAC 
ATAAGNGCAAAGGTATAACATAAAGGGNCATGAACTGCTCAACNAAATTNATCAAAT 
CCATGTTTGTGGGAGTTCTTTTGTNATNGGAAGTTTAACCCCTAA 
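
The Ref 2.1 read appears to run antisense to the cDNA: reverse-complementing cDNA nucleotides 510-553 gives a string that matches the start of the read, apart from one ambiguous base call. A minimal sketch of that check follows. The function name `reverse_complement` is hypothetical, and the exon segment is copied from the translated sequence rows.

```python
# Complement table for the four bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# cDNA nucleotides 510-553: the last base of codon TTG, then the codons that follow.
EXON_510_553 = (
    "G CCC TAT AAG CAG TAC AAC ATG CTG AAC GCG "
    "GAC ACT ACT CGC A"
).replace(" ", "")

antisense = reverse_complement(EXON_510_553)

if __name__ == "__main__":
    print(antisense)
```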




Ref 3.1 

Sequence of BAC13 using primer C5S3, which spans nucleotides 1086-1105 of the cDNA. Exon
sequence is underlined and represents nucleotides 1110-1120.
CCC NG CTCTTTTTG GC A A NGTA A NCTTGG 

NAAGAN TlTlTn AGCTTCATACTTCrcrCTTCAGGGGGACCAAAAGTCACAGAGCATA 

TTAAGTGGCANAACCCCNAAGGTCTTAAGTCTTCCTAGGAAGAAAGCAGATGCCCTGA 

TTCTGTGGGAAGCCACCATGGAGAGGAAAAGCAGTGGCTCCCATATTTGAAGTGNGGA 

CCTAACTCTAGAAAGTTTAAAANGGCCATTTGCTGAAGGGCTATGACATGAGAACAGA 

GATCAACTGAGTGACTTAGCAA^^TCACTCTTTCTCTGTAANACCTCTGGTGAGTGAGA 

NTAAATCCTNTATGTGACGCCCATTAGTCTTACAAAANGTCATGCCNTAAAATGCCAN 

GAAGGNCAGAAATGAATTTCTCACCGCCNGAGGAATGAGGATTATNCTGGGGGGACA 

TGCANAAATATTNNNCCCCCNATTTATTNATTTATTTATTTTTGAGACNGAGTW 

CTAATCGCCCCCAGGCTGGNAGGTGGNAGGTGGTTCCCATCTTNAANCTTANNTNGGA 

AGGNCCTCTTTGNGCCCCNGGGGGGNGNAAAGNGAATTCCCTAAATGCCTNCANNCCC 

CTCCCTGGANGTTATTTGGGGGNNTTNTAAAGGGCNGTGGCNG 



Ref 4.1 

Sequence of BAC13 using primer C5S7, which spans nucleotides 2196-2205 of the cDNA. Exon
sequence is underlined and represents nucleotides 2225-2231.

ACAAAAACT AACCATCANTCTCTAAATCCCAACAANCl'n 1T1TAAGAATACCTAANG 
AGCTCAACNAGGGGGACTNTCCAANGCACTTAAATGCAGNCAAACNACNCCNNCAAG 
AGNGGCAACTACTAATGGGGCANATCTNAAAGAAAATATAGNCAAAGGNNGGAATCA 

t \ a t a n r, a H r y a nr \ r^rv \ vn a a n. n a \ s rrn, n.nn \ mrr.n, a a rrr NMTv^rrvv- 






TTCCTANNNTAGAGANGAGANAACTGGGGACATGGGAAGAGGNAAGCGAAGGGTTCA 

AGGGGANGNAAGCGAGCAGANNCCAGGGNCTCANACTNGNGGGGNNTGGGGGGNTN 
CTGNNNCCCTACNCTTNGNANGAACAGNGN^^ 

Ref 5.1 

Sequence of BAC13 using primer 122047F1, which spans nucleotides 3537-3556 of the cDNA.
Exon sequence is underlined and represents nucleotides 3000-3492. This region does not contain an
intron.

CCANNAGATT^^TGNAACGNNGGTAGGCTTCCTTTGTAGATTTATTGAAAATGTTTCGT 
ACTTCTACAAGTTTGCCCTGCCT^ 

TGGATAATTNTTCCATTTTTCCTCAGGCAATTTNGAGTTACTCCAATAAATTCAAATAT 

GGGCCAGAGGAAATCATCTTTCAGATGGGCAGTGATTGGCCAGTCAGCAGGAGAAGC 

TGCTTATGCCTTGTTTGGTACATTGTGGAAAAACACACTTTAATAAATACG 

CCTGAGTACCCATCCTCCATCCCGCCACCCCCCAGTATGGCAAAAATCTGGTCAGAGT 

CCATTTGGGAATAATTCCATGGTTCCTGGGATCCCAAAGCTTCCAGAAGTGCTGGCTG 

ATCAANGGAGTGTACAGTCAGTCCTGGGTGGCAAAAATGTCCCATTTTTTAAGTACCA 

AGCAAAGGTTCCTTCTTNCAAGGGTTNCTAGGGCC 






Figure

Multiple sequence alignment of Human CLASP proteins with intron/exon borders
indicated by a vertical line. Numbers in right margin correspond to References



HC2A 

KIAA ASGNIJ)KNARFSAJYRQDSNKLSNDmiJ^^ 

rat 

HC4 

HC1

HC3 

HC5 



HC2A 

KIAA FPNYVNSSyiPTKQFETCSKTPITFEVEEFVPCIPKHTQPyTIYTNHiyVYPKyiJCYDSQ 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A VLHHHQNPEFYDEIK 

KIAA KS FAKARN IAI C I EFKDSDEEDS QPLKC I Y GRPG GP VFTRSAFAA VLHHH QNPE FYDEIK 

HC4 

HC1 

HC3 

HC5 



HC2A lELPTQI^KHHLLLTFFWSCDNSSKGSTKKRDVVETQVGYSWLPLLKDGRVVTSEQHI 

KIAA IELPTQT.KEKHHI J.T.TFFIJVSCDNSSKGSTK^ 

rat 

HC4 

HC1

HC3 

HC5 



HC2A PVSA^PSGYLGYQELGMGRHYGPEIK>m>3GKPLLKISTHLVSTVYTQDQHLHNFFQYC 

KIAA P VSANLPS G YLG Y QELGMGRHY GPE I KWVDGGKPLLKI S THL VS TVYT QDQHLHNFFQ YC 

rat 

HC4 

HCl 

HC3 GPGPARSTVSISLISNSARV 

HC5



HC2A QKTESGAC^GNELVKYLKSLHAttEGHVMIAELPT^ 

KIAA QKTESGAQAU^LVXYIJ<SIJiAMEGHVM 

rat 

HC4 ME I QVL I RFLS VI LMQL FWVLPNM I HEDDVP ISCPMV 

HCl MSFXP1ILNQLFKVXV-QNEEDEITTTVTRV 

HC3 NRSRSLSNSNPDISGTPTSPDDEVR5I IGSKGLDRSNSWVKTGGPKAAPWGSNPSPSAE5 

HC5 



B 



HC2A 

KIAA

rat 
HC4 
HC1 

HC3 
HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 
KIAA 
rat 
HC4 

HC1 
HC3 

HC5 



HC2A 

KIAA 

rat 

HC4

HC1 

HC3 

HC5 



HC2A

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 
KIAA 
rat 

HC4
HC1 
HC3 
HC5 



I I HVVAOaBt GLES HLRS YVXYAYKAE ? YVA5 E V F7VHEE L7^fc7 I LKPSATFL7SN 

I I HVYA GUE 5 HLRS YVXYA Y KAE ? YVA5 E Y K 7 VKEE 1 7^^ 7 : IX? SADF1 7 5 N 

LFTm'SKCHEEGL^SYLSSFIKYSFRPGKPSAPCAPLIHETI^TMMIALL^^ 

LPDI VAKCHEEQLDHSVQSYI KFVFKTR AC KE R ? VHE DLAKNVT GLLK-SNDS? 7YK 

7 QAM 0R5 CNRMS SH7ETSSFLQTL7GRL? 7KX1 FHE E LALQWYVCSG - - S VR £ 



Cai-.er :r. 
Cl eava ge 



KLLRYSW FFFDVL I KSMAQHL I ENSKVKLL 
KLLKYSW FFFDVLI KSMAQHL I ENSKVKLL 


RNQF 
RNQF 


F PAS YKHAAE TWNM LM PH I T QK FG D 
FPAS YHHAVE TWNMLMPH I TQK FRD 


KLLKYSW FFFE I IAKSMATYLLEENKI KLI 
HVLhiiSWFFFAI I LKSKAQHLI DTNK IQLE 
SALOCAW FFFELM*/KSMVHHLY FNDKLEAF 


HGQF 
RPQP 
RKSP 


FPKAYHHALHSLFLA1 T - 1 VESQYAE 
FPESYQNELDNLVMVLSDHVIWKYKD 
FPERFMDDIAALVST IASDI VSRFQK 



NPEASKNANHSLAVFIKRCFTFMDRGFVFKQIN NYIS--CFAPGDFf<TLFEYKFEFL 

NPEASKMANHSLAVFIKRCFTFMDRGFVFKQIN NYI S --CFAPGDPKTLFEYKFEFL 



ipkesrnvnyslasflkccltlmdrgfvfnlin dyis — g fs p kd p kvtae y k fe fl 

alee trraths var fu^cft fkdrgcvfkmvn ny is — mfss gdlktlcqykfd fl 

dtemveri^tsi^ffxndllsvmdrgfvfslik^cykqvssklyslpnpsvtJvslrldfl 



R WCNHE H Y I P UN L ?M ? FGKGRI CjR Y Q 7 L C L D Y S L ~ 7 E r 

R WCNrZ-i Y J ? UN 1 ?y ? FGX GR I 2*R Y 5l ; L D Y S L 7 7 E F 



QEVCCHEHFI PLCL? IPJ5ANI PPPLTPSES TQELHA5DMPEYSV7NEF 

RiICSHEHYVTl^U?C5LL7PPASPSPSVSSAT^QSSGFS7N , /QDQKIANMFELS--VPF 
MNADTAPTSPCPSIS S <£fS SS CSS FQDQK I ASM FDRTSRV PA 

Cadherin 



EC motif 



CRNHFLVGL 
CRNHFLV3L 


LLRE 


VGTALQE FRE VR L Z A I 5 VL KN L L I KH S FD 7 R YAS FISHQAR I A 7 


CKHHFLVGL 
CRXHFLlfcl 
RQQHYLAGL 
SSTS-SPGL 


LLRE 
LLRE 
VLTE 
LFTF 


TS IALQDNYE I RYTAI S VI KNLLIKHAFDTRYQHKNOQAKI AQ 

VGFALQEDOD VRiiLALAVLKWI>4AKHSFDDRYKEPRK0A0IAS 

LAV I LD P DAEG L FG LKKKV I NMVHN L LS SHDS DP R Y S DPQ I KAR VAM 
IJ l AALDA£GEGISE^QRXAVSAIHSLLSSHDLDPRCVKPEVIC\ r KIAA 



LY L P L FG L L I ENVQR I NVRDVS P FPVNAG -MTVXPE S LAL PA VN PL VT PQKGS T LDNS LH 
LYLPLFGLLI ENVQR I NVRD VS P FP VNAG - MTVKPE S LAL PA VNPL VT P QKG S 7 LDN 5 LH 



LYLPFVGLLLENIQRL^GRDTLYS CAAMPNSAS R PE FPC G F~ 5 P - - AN - - R 7- 5 L 5 

LYMPLYGMLLDNMPRIYI^DLYPFTVNTSNCjGSRDDLSTNG^FQSQTAIKHANSVP75FS 

LYLPLI3I IME7VPQLYDFTETHNCRGRP I CIA7DDYESE SG SHIS 

LYLPLVGI I LDALPQLCDFTVADTRRYR 7SGSDEEQE GA GAIT 



B 



HC2A
KIAA

rat 
HC4 

HC1

HC3 

HC5 



HC2A

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



HC2A 
KIAA 
rat 
HC4

HC1

HC3
HC5



HC2A 

KIAA 

rat 

HC4

HC1

HC3

HC5



HC2A 

KIAA 

rat 

HC4 

HC1 

HC^ 



r'.: - L^: S3^AJ FVTTiT PNI N5^NAT>SRG£L :5TDSGNSLPERNSEK5NSLDh|HCCi5 
rOLLGA T jfl^S P Y T T 5 7 ?N : N5 VRNAPSRG 5 LIST DSGNSLPEjfcjEKSNS LDKH 00? 5 

77KL7AYGSrQNG HG IKKEDSRGSL I P-EGATGFPDQGNTGEN TRQS 

KT/VLNSI^FSS IAIST/NHAD5RASLASLDSNPSTNEKSSEKTDNCEKIPRPL 

1 7 V AMA I AG T S V PQ LTRPGSFLLTSfrSGRQHT 

2 W ALA I AGNN FN LKTSG- I VLSSJLPYKQYN \ 



7LGNS \"YR C 2KL DQS Z I KS 1 LM 7 FL Y II KSMS ? DAL FT YWN - KAS TSELMDFFT 1 5E7CL 
71GNSV\^CDKLD0SE IKSLL«CFLYILJ<SMSDDALFTYW 



STRSSVSQYNRLDQYE I P^LLMCTYLY IVKMISEDTLLTYWN-KVSPQELI NI L I LLEVCL 
-ALIGSTLRFI?RLDQAETRSLIJ<CI^IMKTISYETLIAYWQ-RAPSPEVSDFFSIlfcvCL 

TFSAESSRSLLICLLWVLKN-ADETVLQKWFTDLSVXQI^RLLDLLTLCV 

MLNADTTRNLKICFLWIMKN-ADQSLIRKWIADLPSTQLNRILDLLFICV 



HQFjQYMGKRY IARKQEGLG- -PIVHDRKS QTLPVSRNRTGMM 

HQFQYMGKRYIAR TGMM 



FHFRYMG KRN IARVH DAW LSKHFGI DRKS QTMPALRKRSGVM 

QNFRYLG KRN I IRK I AAAF — KFVQS TQNN G T LK GSNPS C QTS GLLAQWMHS TSRHEGHK 
SC FE YKGKKVFERMNS LTFK-- KSKDMRAK LEFAILGS I GARQEMV 

LCFEYKGKQSSDKVSTCVLQ--KSRDVKAR LEEALLRGEGARGEMM 



:-LAP L 00 1 3 5 L DNS LT FNHS YGHS DADVLH QS LLEAN I ATE VC 

HARL QQL GSIDNS LT FNHS YGHS DAD VLHQS LLEAN IATEVC 



; A?-. L 0 H L ESLESS FTLNHSSTTT FAD I FH QAL LE GNT AT E V 5 

s c t i ?::?. g k — n al s k ? kl lgml pntm t|s nsnz i d i vhhvdtean i at e c- c 

??.S R GQ l£ RS P S G S A F G S QE NLRWRKDMTHWRQN T EKLDKSRAE I EHEAL I DGNLATEAN 
■vRRAPGNDR FF 3 I^ENLRWKKEQTHWRQANEKXDKTKAELDQEAL I S GNLATEAH 



LTALDT LS LFTIJVF^CLLADHGHNPLMKKVFDVYI^ 

LTALDTLSLFT I AFKN C LLAPH GHNP LMKKV FDVY LC FL QKHQSE TALKNVFT ALRS L : Y 
KLSRGH5PIJ^XK\ r FDVYIXFL0KH0SFJ4ALKNVFTALR5 7 1 7 

I TV L LT I S F FT QC FKT H FLNNDGHNP LMKKV FD I HLAFLKNGQS EVS LKHVFAS LRAF 1 2 
17 I Lr'LVSLFTQTHQRC-OQCrXTQNSIJ^KRGFr^ 

I I • LOT LE I WQTVS - - VTES — KES ILGGVLKVLLHSMACNQSAVYLQHCFATQRALVS 

II ILDMQENI I OAS S - - AL DC - - KDS LLGGVLR VLVNS LNCDQS TTYLTHC FAfrLRAL I A 3 . / 



KFPSTFYEGRADMCAALCYE ILKCCNSKLSS IRTEASQLLYFI>CR>INFDYTGKKSFVRTH 
KFPS T FYE GRADMCAALC YE I LKCCNSKLSS I RTEASQJULYFI>IRNNFDy TGKKS FVRTH 
K FPS T FYEGRADMCAS LC YEVLKCCNSKLSS I RTEASQLLYFLMRNNFDYTGKKSFVRTH 
K FPS AF FKGR VNMCAAFC YEVLKCCTS K I S S T RNEASALL YLLMRNNFE YTKRKTFLRTH 
K FPS A F FOG PADLCGSFCYFA^KCC^R5RSTQTEASALLYLFMRKNFEFNXQKSIVRSH 
J^PELLFEEETECXADLCLRLLRllCSSSIGTIRSHPSASLYTI^C^ 

KFGDLLFEEEVXCXrFDLCHQV^HHCSSSMDVTRSQACATLYLl>lRFSFtiATS — NFARVK 



LQVI ISVSQLIADWGIGETRFQQSLSI INNCANSDRLIK*iTSF^SDVKDLTKJ*IRTVLM 
LQVIISVSQLIADVVGIGGTRFQQSLSIIJWCANSDiaiK^ 

LQVI I S LSQLI ADWG I GGTRFQQSLS I INNCANSDRLI KHTS FSSDVXDLTKRI RTVLM 
LQI I IAVSQLIADVALSGGSRFQESLFI INNFANSDRFMLARAFTAEVKDLTKR I RTVLM 
LQf, I KAVSQL IAD-AG I GGSRFQHSLAI TNNFANGDKQKK^NFPAEVKDLTKRI RTVLM 



B 



Transmembrane 



HC2A

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



ATACM]<EHENDPE>rLVTlLOYSI-AKS YASTPELRKTWLDSMAR IHVKNGE LSEAAMCYVHV 
AT AQMKE HEND P EML VD LQ Y S LAKS YAS T PE LRKTWLDSMAR I HVKNG IhS E1AAMCYVHV 
ATAQMKEHEKDPEMLVT)1^YSLAKSYASTPELJ^TWLDSMAJ<I HVKNG I LSEAAMC YVHV 
ATAQHKEHEKDPEML I DLQYSIJy<SYASTPELRKTWLDSMAKI HVKNGI FSEAAMC YVHV 
ATAQMKEHEKDPEMLV1DI^YS1JWSYASTPE1J<RTWLESMAKIHARNGI LSE^AMCYIH I 
DTVKMKEHQEDPEMLI DI>TYRIAKGYQTSPDIJU/rWU>N^GKHSERS* HAEAAQCLVHS 
DTVT<MREFQ£DPEMLMDLWYRIAKSYQAS^ YTEAAMCLVHA 



domain 



SH3 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 

HC5 



TALVAEYL TRKGV 

TALVAEY1 TRKEA 

TALVAEY1 THREAD- - 

AALVAEF1 KRKKL 

AALIAEYI KRKGYW 

AALVAEYZSMLED 

AALVAEYI SMLED 



FRQGCTAFRVI TPN 

-VQWEPPLLPHSHSACLRJ^RGGVFRQGCTAFRVI TPN 
-lALQREPPVFPYSHTSCQRK SRGGM FRQGC TAFR V I T PN 
FFNGCSAFKKITPN 



RKYLPVGCVTEC^ISSN 

HS YLPVGSVS FQN ISSN 



1 7 AH 



HC2A 
KIAA 
rat 
HC4 

HC1 
HC3 
HC5 



I DEEASMMEDVCIKQD VTiFNEDVlMELLEQCADGLWKAEF YELIADI YKLI IPI 

I DEEASMMEDVGMQD VHFNEDVI>IELI^QCAJXSLWKAE HYELIADI YKLlj l P I 

I DEEASHMEDVGMQD VH FNE DVLMELLEQCADGLWKAE RLRAG LLTS I NS S S P 

I DEEGAMKEDAGMMD VHYSEEVLLELLEQCVNGLWKAEHYEI 2SEI SKL I GP I 

IKEEGAAKEDSGKHT) TPYNi|n 1 LVEQLYMCGE FLWKSEItEEljiADWKP 1 1AV 

vl.eesavsddwspdeegicsgkyftesglvglleqaaasfsmag^eavne vykvi. ipi 
vleeswsedtlspdeixw'cagoyftesglvglleoaaelfstgglyetvnevEEl3^ p i 



HC2A 

KIAA 

rat

HC4

HC1

HC3

HC5



ITAM ITAM ITAM ITAM 

YEKpfrtD " ~ 

YEKRRDFERLAHIffPTlUlRifeSK^EVMHSGRRLLGT fY 

SMKSGGTLETTHL !f DTI HRF YSK\TEVITR A AGSWDLLPGGLFGQ 

YENRRE FENLTQV YRT 1 HGA YTK3 LEVMH TKKRL LG TFFRVA FY G Q 

FEKQRTJFKKLSDL TYD: HR SiYLKM AEVVNS EKRLFG rfYYRMAFYGQ 

HEANRDAKKLSTIHGKLQEAFSKIVHQ^TGWERMFG 1 YFRVGFYG- 

LEAHRE FRKLT LTH SKLQRAFDS I VNKDH - - KRM FG 4YFRVJG F FG - 



HC2A 
KIAA 
rat 
HC4

HC1
HC3
HC5



ITAM ITAM 
-FFEDEDGK^lYKUPKLTPLSEISQRLIJatfSDHFGSE^ 

GFFEDEDGK3 YI YKE PKXTPLSE ISQRLLKI YSDK FGSENVKMIQDSGKVNPKDLDS^ YA 
GFFEDEDGKE YIYKE PKLTPLSE ISQRLLKI YSDf FGSENVKHIQDSGKVNPKDLDS* FA 
5FTEEEDGKI YIYKE PKLTGLSEI SLRLVKI YGEK FGTENVKI IQDSDKVNAKELDPI- YA 
GFFEEEEGKI YIYKE PKLTGLSEI SQRLLK1 YADB FGADNVKI I QDSNKVNFKDLDPKYA 
TKFGDLDEQE FViKE PAITKIJ^I SHItflE^e YGEF FGE D WEV I KD S N PVDKC KL D P NKA 
SKFGDLl^Q qFVYKEy AITKLPEISHRLEA^YG 



HC2A 

KIAA 

rat 

HC4 

HC1 

HC3 



ITAM 

YTbVTHVIPFFDEKELQERKTEFERSHNIRRFMFEMPFT 
YI }VTHVIPFFTIEKELQERKTEFERSHNIRRFMFEMPFTQTGKRQGGV^ 
Y I 2VTHVT P FFDEKE LQERKTE FERCHN I RR FM FEMP FTQTGKRQGGVEEQCKRRT I LT A 
HI 2VTYVKPYFDDKELTERKTEFERNHNISRFVFEAPYTLSGKKQGCIEEQCKRRTI LTT 
|YIQ\rtTYVTPFFEEKE lEDRKTDFEMHHNINRFVFETPFTI^GKKHGGVAEQCKRRT I LTfr 
I Y I Q rJTYVEPY FDTYEMKDR I TY FDKNYNLRRFMYCTP FTLDGRAHGELHEQFKRKT T LTT
Genomic BAC#19
DNA DNA 



-111 

CGGTAACCGCCATTTTGTCTCCTGTAACAATTTACGCGCCGTGTAACTGTGAATCTTTCAAAGCCTCAGTTTTATGACC 
CTGTGGAGCCAGTGGACTTTGAAGGACTTCTG - 1 



1/1 31/11 

ATG ACA CAC CTG AAC AGC CTG GAT GTG CAG CTT GCC CAG GAG CTC GGG GAC TTC ACT GAT

Met thr his leu asn ser leu asp val gln leu ala gln glu leu gly asp phe thr asp
61/21 91/31 

GAC GAC TTG GAC GTG GTG TTC ACG CCA AAG GAA TGT AGG ACT TTG CAG CCC TCT TTG CCG 

asp asp leu asp val val phe thr pro lys glu cys arg thr leu gin pro ser leu pro 
121/41 151/51 

GAG GAA GGG GTT GAA CTG GAC CCT CAT GTC AGG GAC TGT GTT CAG ACC TAC ATC CGT GAG 

glu glu gly val glu leu asp pro his val arg asp cys val gin thr tyr ile arg glu 
181/61 211/71 

TGG CTA ATC GTG AAC CGG AAA AAC CAA GGA AGT CCA GAA ATC TGT GGC TTT AAA AAG ACT 

trp leu ile val asn arg lys asn gin gly ser pro glu ile cys gly phe lys lys thr 
241/81 271/91 

GGA TCT CGA AAA GAT TTT CAC AAG ACG CTT CCG AAA CAG ACG TTT GAG TCG GAA ACC TTG 

gly ser arg lys asp phe his lys thr leu pro lys gin thr phe glu ser glu thr leu 
301/101 331/111

GAG TGC AGT GAA CCC GCT GCT CAG GCA GGC CCC CGC CAC TTA AAC GTG CTG TGC GAC GTG 

glu cys ser glu pro ala ala gin ala gly pro arg his leu asn val leu cys asp val 
361/121 391/131 

TCT GGG AAA GGC CCC GTC ACT GCC TGT GAC TTT GAC CTC CGC AGC CTG CAG CCT GAC AAG 

ser gly lys gly pro val thr ala cys asp phe asp leu arg ser leu gin pro asp lys 
421/141 451/151 

CGG CTA GAA AAC CTC CTG CAG CAA GTG AGT GCC GAG GAC TTT GAG AAG CAG AAC GAG GAG 

arg leu glu asn leu leu gin gin val ser ala glu asp phe glu lys gin asn glu glu 
481/161 511/171 

GCC CGG AGG ACC AAC AGG CAG GCC GAG CTC TTT GCC TTT TAC CCA TCA GTG GAC GAG GAG 

ala arg arg thr asn arg gin ala glu leu phe ala leu tyr pro ser val asp glu glu 
541/181 571/191 

GAT GCT GTG GAA ATA CGT CCA GTA CCA GAA TGT CCC AAG GAA CAC CTG GGC AAC AGA ATA

asp ala val glu ile arg pro val pro glu cys pro lys glu his leu gly asn arg ile
601/201 631/211 

TTG GTC AAG TTG CTG ACC TTG AAG TTC GAG ATT GAA ATT GAG CCC CTG TTT GCC AGC ATT 

leu val lys leu leu thr leu lys phe glu ile glu ile glu pro leu phe ala ser ile
661/221 691/231 

GCC CTC TAC GAT GTT AAA GAA AGG AAA AAG ATC TCA GAA AAT TTT CAC TGT GAC CTG AAC 

ala leu tyr asp val lys glu arg lys lys ile ser glu asn phe his cys asp leu asn 
721/241 751/251 

TCT GAC CAG TTT AAA GGA TTT CTG CGA GCT CAC ACG GCT TCA GTG GCC GCA TCA AGT CAG

ser asp gln phe lys gly phe leu arg ala his thr pro ser val ala ala ser ser gln
781/261 811/271

GCG AGA TCT GCA GTC TTC TCA GTC AC Z TAC CCG TC G TCA GAC ATC TAC CTG GTA GTC AAG 

ala arg ser ala val phe ser val thr tyr pro ser ser asp ile tyr leu val val lys 
841/281 871/291 

ATT GAA AAA GTC CTG CAG CAG GGA GAT ATT GGA GAC TGT GCA GAG CCC TAC ACG GTT AT Z 

ile glu lys val leu gin gin gly asp ile gly asp cys ala glu pro tyr thr val ile 
901/301 931/311 

AAA GAA AGT GAT GGT GGA AAG AGT AAA GAA AAG ATT GAA AAA CTA AAA CTC CAA GCT GAA 

lys glu ser asp gly gly lys ser lys glu lys ile glu lys leu lys leu gln ala glu



\ 




1021/341 

TCA AGC TTC TTC AAT GTC TCC ACC CTT GAG 
ser ser phe phe asn val ser thr leu glu 
1081/361 

GGG AGA AGC CCA GTG GGT GAA CGG AGG ACA 
gly arg ser pro val gly glu arg arg thr 
1141/381 

GCC CTC TCC TTG GAG GAA AAT GGG GTT GGA 
ala leu ser leu glu glu asn gly val gly 
1201/401 

AGC AGC TTT TTC AAG CAG GAA GGA GAT CGC 
ser ser phe phe lys gin glu gly asp arg 
1261/421 

GCT GAC TAC AAA AGA TCA TCA TCC TTA CAG 
ala asp tyr lys arg ser ser ser leu gin 
1321/441 

AGA CTG GAG ATT TCT ACA GCT CCA GAG ATC 
arg leu glu ile ser thr ala pro glu ile 
1381/461 

CCC GTG AAA CCC TTT CCT GAA AAC CGG ACA 
pro val lys pro phe pro glu asn arg thr 
1441/481 

ACA CGA GAA GTA TAT GTC CCT CAC ACT GTG 
thr arg glu val tyr val pro his thr val 
1501/501 

AGG CTG AAC TTT GTA AAC AAA CTA GCA TCA 
arg leu asn phe val asn lys leu ala ser 
1561/521 

ATG TGT GGA GAA GAT GCT AGC AAT GCG ATG 
met cys gly glu asp ala ser asn ala met 
1621/541

GAA TTT CTG CAG GAA GTG TAC ACA GCT GTT 
glu phe leu gin glu val tyr thr ala val 
1681/561 

GAA GAA GTG AAA ATT AAG CTC CCC GCT AAG 
glu glu val lys ile lys leu pro ala lys 
1741/581 

TTC TAC CAT ATC AGC TGT CAG CAG AAG CAA 
phe tyr his ile ser cys gin gin lys gin 
1801/601 

TCA TGG CTG CCA ATT CTC TTA AAT GAA CGT 
ser trp leu pro ile leu leu asn glu arg 

1861/621 

GCC TTG GAA AAA TTG CCA CCC AAC TAC TCC 
ala leu glu lys leu pro pro asn tyr ser 
1921/641 

AAT CCT CCC ATT AAG TGG GCT GAA GGA CAT
asn pro pro ile lys trp ala glu gly his 
1981/661 

GTT TCT TCT GTA CAC ACC CAG GAC AAC CAC 
val ser ser val his thr gin asp asn his 
2041/681 

CTG GAG AGC CAG GTG ACC TTC CCC ATC CGC 




1051/351 

AGG GAG GTA ACT GAT GTG GAC TCT GTG GTT 
arg glu val thr asp val asp ser val val 
1111/371 

TTG GCC CAA TCT AGA AGG CTT TCT GAA AGA 
leu ala gin ser arg arg leu ser glu arg 
1171/391 

TCC AAC TTC AAA ACC TCC ACT CTG AGC GTT 
ser asn phe lys thr ser thr leu ser val 
1231/411 

CTT AGC GAT GAA GAC TTA TTC AAG TTT TTA 
leu ser asp glu asp leu phe lys phe leu 
1291/431 

AGA CGA GTC AAG TCA ATT CCA GGC TTG CTA 
arg arg val lys ser ile pro gly leu leu 
1351/451 

ATC AAT TGC TGT CTG ACT CCT GAA ATG CTG 
ile asn cys cys leu thr pro glu met leu 
1411/471 

CGC CCG CAC AAA GAG ATT TTG GAA TTT CCA 
arg pro his lys glu ile leu glu phe pro 
1471/491 

TAC AGA AAC CTT CTC TAT GTC TAC CCA CAG 
tyr arg asn leu leu tyr val tyr pro gin 
1531/511 

GCC CGG AAC ATT ACA ATA AAG ATC CAG TTT 
ala arg asn ile thr ile lys ile gin phe 
1591/531 

CCG GTC ATC TTT GGA AAA TCC AGC GGG CCT 
pro val ile phe gly lys ser ser gly pro 
1651/551 

ACA TAC CAT AAT AAG TCT CCT GAC TTT TAT 
thr tyr his asn lys ser pro asp phe tyr 
1711/571 

CTC ACA GTA AAT CAC CAC CTC CTG TTC ACC 
leu thr val asn his his leu leu phe thr 
1771/591 

GGA GCC TCC GTG GAA ACT CTC CTG GGA TAT 
gly ala ser val glu thr leu leu gly tyr 
1831/611 

CTT CAA ACT GGA TCC TAC TGT CTC CCA GTT 
leu gin thr gly ser tyr cys leu pro val 

1891/631 

ATG CAT TCT GCT GAG AAA GTC CCA TTA CAG 
met his ser ala glu lys val pro leu gin 
1951/651

AAG GGA GTA TTT AAT ATT GAA GTG CAA GCT 
lys gly val phe asn ile glu val gin ala 
2011/671 

CTG GAG AAG TTC TTC ACC CTC TGC CAC TCC 
leu glu lys phe phe thr leu cys his ser 

2071/691 

GTG CTG GAT CAG AAA ATC AGC GAG ATG GCG 
val leu asp gln lys ile ser glu met ala
2161/721

GTG CTC TTC CTG CAC CTG GTG CTG GAC AAG 
val leu phe leu his leu val leu asp lys 
2221/741 

ATC GCT GGC CAG ACA GCC AAC TTC TCC CAG 
ile ala gly gln thr ala asn phe ser gln
2281/761 

AAC AGT CTG CAC AAC AGC AAG GAC CTG AGC 
asn ser leu his asn ser lys asp leu ser 
2341/781 

GCT TCC TAC GTG CAC TAC GTC TTC CGC CTG 
ala ser tyr val his tyr val phe arg leu 
2401/801 

GGC GCT CCC ACT GCC CTC CTA GAC CCT CGG 
gly ala pro thr ala leu leu asp pro arg 
2461/821 

GCT GCT GTG AGT TCA AAG CTG CTG CAG GCC 
ala ala val ser ser lys leu leu gin ala 
2521/841 

GCG GGG ACA CAC TCC GCA GCA GAC GAG GAA 
ala gly thr his ser ala ala asp glu glu
2581/861 

GAT CGC AAC TGC AGC CGA ATG TCT TAC TAT 
asp arg asn cys ser arg met ser tyr tyr 
2641/881 

CCT GCA GCC CCA AGG CCA GCC AGC AAA AAG 
pro ala ala pro arg pro ala ser lys lys 
2701/901 

GTG GTC AGC ACC GGA ATG GTG AAA AGC ATG 
val val ser thr gly met val lys ser met 
2761/921 

GAC AGT TTT CGG AGG ACT CGT TTT TCT GAC 
asp ser phe arg arg thr arg phe ser asp 
2821/941 

AAT GTG GTC ACC TCG GAA ATT GCA GCC CTT 
asn val val thr ser glu ile ala ala leu 
2881/961 

GCG GAA AAG ATG AAC ATC AGC CTG GCT TTC 
ala glu lys met asn ile ser leu ala phe 
2941/981 

CGG GGC TTT GTG TTT AAC CTC ATC AGA CAT 
arg gly phe val phe asn leu ile arg his 
3001/1001 

AAC CTT CCA ACG CTC ATT TCC ATG AGG CTA 
asn leu pro thr leu ile ser met arg leu 
3061/1021 

CAT TAC CTC AAT CTG AAC CTT TTT TTT ATG 
his tyr leu asn leu asn leu phe phe met 
3121/1041 

CCT TCC ATA TCT TCC CAG AAC TCA AGC TCC 
pro ser ile ser ser gin asn ser ser ser 
3181/1061 

AGC ATG TTC GAT CTG ACT TCC GAG TAC CGC 




2191/731 

CTC TTC CAG CTG TCC GTG CAG CCC ATG GTC 
leu phe gin leu ser val gin pro met val 
2251/751 

TTT GCC TTC GAG TCC GTG GTG GCC ATC GCC 
phe ala phe glu ser val val ala ile ala 
2311/771 

AAG GAC CAG CAT GGG AGG AAC TGC CTG CTG 
lys asp gin his gly arg asn cys leu leu 
2371/791 

CCA GAG GTG CAA AGG GAT GTG CCC AAG TCA 
pro glu val gin arg asp val pro lys ser 
2431/811 

AGC TAC CAC ACG TAT GGC CGC ACA TCA GCT 
ser tyr his thr tyr gly arg thr ser ala 
2491/831 

CGG GTG ATG AGC AGC AGT AAC CCA GAC CTC 
arg val met ser ser ser asn pro asp leu 
2551/851 

GTG AAG AAC ATC ATG TCT TCA AAG ATC GCC 
val lys asn ile met ser ser lys ile ala 
2611/871 

TGC TCT GGC AGT AGT GAT GCT CCA AGT TCA 
cys ser gly ser ser asp ala pro ser ser 
2671/891 

CAT TTC CAT GAG GAG CTT GCC CTT CAG ATG 
his phe his glu glu leu ala leu gin met 
2731/911 

GCC CAG CAC GTA CAT AAC ATG GAC AAA CGG 
ala gin his val his asn met asp lys arg 
2791/931 

CGT TTC ATG GAT GAC ATA ACT ACT ATT GTT 
arg phe met asp asp ile thr thr lie val 
2851/951 

TTA GTA AAA CCA CAG AAG GAA AAT GAA CAG 
leu val lys pro gin lys glu asn glu gin 
2911/971 

TTC TTG TAT GAC CTT CTC TCC CTC ATG GAT 
phe leu tyr asp leu leu ser Leu met asp 
2971/991 

TAT TGC AGC CAG CTG TCA GCC AAG CTC AGT 
tyr cys ser gin leu ser ala lys leu ser 

3031/1011 

GAG TTC CTG AGA ATC CTC TGT AGC CAT GAG
glu phe leu arg ile leu cys ser his glu
3091/1031 

AAT GCT GAT ACT GCT CCA ACA TCT CCT TGT 
asn ala asp thr ala pro thr ser pro cys 
3151/1051 

TGC TCC AGC TTC CAG GAC CAG AAG ATC GCC 
cys ser ser phe gin asp gin lys ile ala 
3211/1071

CAG CAG CAC TTC CTC ACC GGG CTC CTC TTC 



3301/1101 

GCT GTC AGT GCA ATT 

ala val ser ala ile 
3361/1121 

CCA GAG GTG AAG GTC 
pro glu val lys val 
3421/1141 

GCT TTG CCA CAG CTC 
ala leu pro gin leu 
3481/1161 

TCG GAT GAA GAA CAA 
ser asp glu glu gin 
3541/1181 

GGG AAT AAT TTC AAT 
gly asn asn phe asn 
3601/1201 

TAC AAC ATG CTG AAC 
tyr asn met leu asn 
3661/1221 

AAA AAT GCT GAT CAG 
lys asn ala asp gin 
3721/1241

AAC AGG ATT TTA GAT 
asn arg ile leu asp 
3781/1261 

AGT TCT GAC AAA GTC 
ser ser asp lys val 
3841/1281 

GAA GAG GCT TTG CTG 
glu glu ala leu leu 

3901/1301 

GGG AAC GAC CGA TTT 
gly asn asp arg phe 
3961/1321 

TGG CGG CAA GCT AAT 
trp arg gin ala asn 
4021/1341 

ATC AGT GGC AAT CTG 
ile ser gly asn leu 
4081/1361 

ATC CAG GCG AGC TCG 
ile gin ala ser ser 

4141/1381

CTG GTG AAT TCT CTG
leu val asn ser leu
4201/1401

CTC CGT GCT CTC ATC
leu arg ala leu ile
4261/1421 

TTC GAC CTA TGT CAC 
phe asp leu cys his 
4321/1441 

CAA GCC TGT GCC ACC 
gin ala cys ala thr 




CAC AGC CTG CTA AGT 

his ser leu leu ser 

AAA ATC GCC GCC CTT 
lys ile ala ala leu 

TGT GAC TTT ACA GTT 
cys asp phe thr val 

GAA GGA GCC GGT GCC 
glu gly ala gly ala 

TTG AAA ACA AGT GGA 
leu lys thr ser gly 

GCG GAC ACT ACT CGC 
ala asp thr thr arg 

AGC CTC ATT AGG AAG 
ser leu lie arg lys 

CTA CTT TTC ATC TGT 
leu leu phe ile cys 

AGT ACC CAA GTC CTG 
ser thr gin val leu 

CGT GGG GAA GGG GCC 
arg gly glu gly ala 

CCA GGC CTA AAT GAA 
pro gly leu asn glu 

GAG AAG CTA GAT AAA 
glu lys leu asp lys 

GCT ACA GAA GCA CAT
ala thr glu ala his 

GCT CTG GAC TGT AAA 
ala leu asp cys lys 

AAC TGT GAT CAG AGT 
asn cys asp gin ser 

GCC AAG TTT GGA GAC
ala lys phe gly asp 

CAA GTC CTG CAC CAC 
gin val leu his his 

CTT TAC CTC CTC ATG
leu tyr leu leu met 




3331/1111 

TCT CAC GAC CTG GAC 
ser his asp leu asp 
3391/1131 

TAC CTA CCT TTA GTT 
tyr leu pro leu val 
3451/1151 

GCA GAT ACT CGC AGA 
ala asp thr arg arg 
3511/1171 

ATT AAC CAG AAT GTG 
ile asn gin asn val 
3571/1191 

ATA GTG CTG TCT TCC 
ile val leu ser ser 
3631/1211 

AAC CTC ATG ATC TGC 
asn leu met ile cys 
3691/1231 

TGG ATT GCT GAC CTG 
trp ile ala asp leu 
3751/1251

GTG TTA TGT TTT GAG 
val leu cys phe glu 
3811/1271 

CAG AAG TCA AGG GAT 
gin lys ser arg asp 
3871/1291 

AGA GGG GAG ATG ATG 
arg gly glu met met 
3931/1311 

AAT TTG AGA TGG AAG 
asn leu arg trp lys 
3991/1331 

ACA AAG GCC GAG TTA 
thr lys ala glu leu 
4051/1351 

TTA ATC ATC CTG GAT 
leu ile ile leu asp 
4111/1371 

GAC AGC CTG CTG GGA 
asp ser leu leu gly 

4171/1391

ACC ACC TAC CTG ACT 
thr thr tyr leu thr 
4231/1411 

TTA CTC TTC GAA GAG 
leu leu phe glu glu 
4291/1431 

TGC AGC AGC AGC ATG 
cys ser ser ser met 
4351/1451 

AGG TTC AGT TTT GGA 

arg phe ser phe gly




CCA CGC TGT GTC AAA 

pro arg cys val lys 

GGC ATC ATT TTG GAT 
gly ile ile leu asp 

TAC CGC ACC AGT GGC 
tyr arg thr ser gly 

GCT CTG GCC ATA GCA 
ala leu ala ile ala 

TTG CCC TAT AAG CAG 
leu pro tyr lys gin 

TTC CTC TGG ATC ATG 
phe leu trp ile met 

CCA TCA ACG CAG CTC 
pro ser thr gin leu 

TAT AAG GGA AAA CAG 
tyr lys gly lys gin 

GTC AAG GCC CGG CTG 
val lys ala arg leu 

CGC CGC CGG GCT CCA 
arg arg arg ala pro 

AAA GAG CAG ACA CAT 
lys glu gin thr his 

GAT CAA GAA GCC TTG 
asp gin glu ala leu 

ATG CAG GAA AAC ATT 
met gin glu asn ile 

GGT GTT CTG AGG GTG 
gly val leu arg val 

CAC TGC TTT GCA ACA 
his cys phe ala thr 

GAG GTG GAA CAG TGT
glu val glu gin cys 

GAT GTC ACC CGG AGC 
asp val thr arg ser 

GCC ACC AGT AAT TTT 

ala thr ser asn phe 



4441/1481 

TTT AAT GAA GAG CAC 
phe asn glu glu his 
4501/1501 

ACA GCC ATG CAG ATG 
thr ala met gin met 
4561/1521 

AGC ATC TTA TAT GAC 
ser ile leu tyr asp 
4621/1541 

GAT CTC ATG TAC AGA 
asp leu met tyr arg 
4681/1561 

CTC CAG AAC ATG GCA 
leu gin asn met ala 
4741/1581 

CTG GTG CAC GCC GCT 
leu val his ala ala 
4801/1601 

CTG CCC GTG GGC AGT 
leu pro val gly ser 
4861/1621 

GTC TCT GAG GAC ACC 
val ser glu asp thr 
4921/1641 

GAG AGT GGC CTG GTA 
glu ser gly leu val 
4981/1661

TAT GAG ACA GTT AAT 
tyr glu thr val asn 
5041/1681 

TTC CGG AAG CTG ACA 
phe arg lys leu thr 
5101/1701 

AAG GAT CAT AAG AGA 
lys asp his lys arg 
5161/1721 

GGG GAT TTG GAT GAA 
gly asp leu asp glu 
5221/1741 

ATC TCA CAT AGA CTA 
ile ser his arg leu
5281/1761

ATT AAA GAC TCC ACT
ile lys asp ser thr
5341/1781

ATC ACT TTT GTG GAG 
ile thr phe val glu 
5401/1801 

GAG AAG AAT TTC AAC 
glu lys asn phe asn 
5461/1821 

CCT CGG GGA GAG CTG 




CTG AGA AGA TCC TTG 
leu arg arg ser leu 

ACT CCT TTT CCC ACC 
thr pro phe pro thr 

ACA GTG AAA ATG AGG 
thr val lys met arg 

ATT GCC AAG AGT TAC 
ile ala lys ser tyr 

GAG AAA CAC ACC AAG 
glu lys his thr lys 

GCG TTA GTG GCT GAG 
ala leu val ala glu 

GTC AGC TTC CAG AAT 
val ser phe gin asn 

CTG TCA CCT GAC GAG 
leu ser pro asp glu 

GGC CTC CTG GAG CAG 
gly leu leu glu gin 

GAG GTC TAC AAG CTG 
glu val tyr lys leu 

CTC ACT CAC AGC AAG 
Leu thr his ser lys 

ATG TTT GGA ACC TAC 
met phe gly thr tyr 

CAG GAG TTT GTC TAC 
gin glu phe val tyr 

GAG GCA TTT TAT GGT 
glu ala phe tyr gly

CCT GTG GAC AAA ACC
pro val asp lys thr 

ZCC TAC TTT GAT GAG 
prD tyr phe asp glu 

CTC CGG AGG TTC ATG 
leu arg arg phe met 

CAT CAG CAG TAC AGA 




4471/1491 

AGG ACA ATT TTG GCC 
arg thr ile leu ala 
4531/1511 

CAG GTG GAG GAA CTT 
gin val glu glu leu 
4591/1531 

GAA TTT CAG GAA GAT 
glu phe gin glu asp 
4651/1551 

CAG GCA TCT CCT GAT 
gin ala ser pro asp 
4711/1571 

AAG AAG TGC TAC ACG 
lys lys cys tyr thr 
4771/1591 

TAT CTG AGC ATG CTG 
tyr leu ser met leu 
4831/1611 

ATT TCT TCC AAT GTG 
ile ser ser asn val 
4891/1631 

GAT GGG GTG TGC GCA 
asp gly val cys ala 
4951/1651 

GCC GCG GAG CTC TTC 
ala ala glu leu phe 
5011/1671 

GTC ATC CCC ATC CTA 
val ile pro ile leu 

5071/1691 

CTG CAG AGA GCC TTC 
leu gin arg ala phe 
5131/1711 

TTC CGA GTT GGT TTC 
phe arg val gly phe 
5191/1731 

AAA GAG CCT GCA ATT 
lys alu pro ala ile 
5251/1751 

CAA TGT TTT GGT GCA 
gin cys phe gly ala 

5311/1771

AAG TTG GAT CCT AAC
lys leu asp pro asn
5371/1791

TAT GAG ATG AAA CAC
tyr glu met lys asp
5431/1811

TAC ACC ACC CCG TTC
tyr thr thr pro phe
5491/1831

AGG AAC ACA GTC CTG
TAT TCA GAA GAG GAC
tyr ser glu glu asp 

CTC TGT AAT CTG AAT 
leu cys asn leu asn 

CCT GAG ATG CTT ATG 
pro glu met leu met 

CTG CGG CTG ACC TGG 
leu arg leu thr trp 

GAG GCT GCC ATG TGC 
glu ala ala met cys 

GAG GAC CAC AGC TAC 
glu asp his ser tyr 

CTG GAG GAG TCT GTG 
leu glu glu ser val 

GGC CAG TAC TTC ACC 
gly gin tyr phe thr 

AGC ACG GGA GGC TTA 
ser thr gly gly leu 

GAA GCG CAT CGA GAA 
glu ala his arg glu 

GAC AGC ATC GTT AAC 
asp ser ile val asn 

TTT GGA TCC AAA TTT 
phe gly ser lys phe 

ACC AAG CTT CCT GAG 
thr lys leu pro glu 

GAA TTT GTG GAA GTG 
glu phe val glu val 

AAG GCC TAC ATA CAG 
lys ala tyr lie gin 

AGG GTC ACA TAC TTT 
arg val thr tyr phe 

ACC CTG GAG GGG CGG 
thr leu glu gly arg 

ACC ACT ATG CAC GCC
5611/1871

AAG ACC CTG CAG TTA GCA GTT GCC ATT AAC 

lys thr leu gin leu ala val ala ile asn 
5671/1891 

ATG GTG CTG CAA GGC TCT GTG GGA GCT ACT 
met val leu gin gly ser val gly ala thr 
5731/1911 

GTG TTT TTG GCT GAA ATT CCT GCT GAT CCA 
val phe leu ala glu ile pro ala asp pro 
5791/1931 

TTA TGC TTT AAG GAA TTC ATC ATG AGA TGT 
leu cys phe lys glu phe ile met arg cys 
5851/1951 

ATC ACG GCA GAC CAG AGG GAA TAT CAG CAG 
ile thr ala asp gin arg glu tyr gin gin 
5911/1971 

GAG AAC CTC AGG CCA ATG ATC GAG CGG AAA 
glu asn leu arg pro met ile glu arg lys 
5971/1991 

GTT GAG AGT CAA AAG AGG GAC TCC TTC CAC 
val glu ser gin lys arg asp ser phe his 
6031/2011



GAAAAGCCATCTTCATTCGTGGAGACTGTGGCCCTGCAACCCTGGAGAAGGACTTGCTGGTACTTAAAAAATGGGACATT 
TGCCACCCAGGACTGACTGTACACTCCCTGATCAGCCAGCACTCTGGAAGCTTTGGGATCCCAGGAACCATGGAATTATT 
CCCAAATGGACTCTGACCAGATTTTTGCCATACTGGGGGGTGGCGGGATGGAGGATGGGTACTCAGGCATGACTGCGTAT 
TTATTAAAGTGTGTTTTTCCACAATGTACCAAACAAGGCATAAGCAGCTTCTCCTGCTGACTGGCCAATCACTGCCCATC 
TrACAGATGATTTCCTCTGGCCCATATTTGAATTTATTGGAGTAACTCAAATTGCCTGAGGAAAAATGGAAAAATTATCC 
ArrAGT^GATTCAAACTGAATTTCACTCTTTATAGGAAGGCAGGGCAAACTTGTAGGAGTACGAAACATTTTCAATAAAT 
rTACAAAGGGAAGCCTTACTACAATTCCAAAAATCATCATGGTTGGAAATTTGGGAGGAGATTATTTGTGAACTTGTTAC 
CCTTTTGGTAATGGTGGACTAATTGCTGTATAGTTATTTTTGTTTTATTATTACTGTTACATTAATTTAACATGCATTTA 
TAGAAGAATACATTCAAAGCACTGATGTAGGAGATACACGGTACTTGGAGCAGTCAGCCAAAAATCACAGATACTGCTTT 
CACTTAAATGGAAACAATTCTCCGATAATGCTTTGCTTTTTTTCTTATGTCACTCTTGTGTACTATCTATTTTTCTCCTC 
TCTGGGACCAAGTTTCTTTTTATAAAGCAATAATATCTCTGTTTTCATTTCAGAACATTGTGCTGTCTGTCAGCATATGT 
AT ATT AGrTACAAAATATATTCAACTTTGACTTCTTTTGACAAAGGACTTTAGGAAAAGGAGGAACAAAGACATTATTTG 
AGAATTAAATTATATATTTTTAATATGACTGTGACCTTGACTGATAATAAAGATGTAATAAGAATTGCAAGCTAAAAAAA 

AAAAAAAAAAAACTCG 



5581/1861

AAG ATT GAA GTT GCC ATT NNN GAC ATG AAG
lys ile glu val ala ile glu asp met lys
5641/1881

CAG CAG GAG CCG CCT GAT GCA NNN ATG CTT
gln gln glu pro pro asp ala lys met leu
5701/1901

GTA AAT CAG GGA CCA CTC GAA GTA GCC CAA
val asn gln gly pro leu glu val ala gln
5761/1921

AAA CTC TAT CGA CAT CAC AAC AAG TTG AGG
lys leu tyr arg his his asn lys leu arg
5821/1941

GGT GAA GCT GTA GAG AAA AAC AAG CCT CTC
gly glu ala val glu lys asn lys arg leu
5881/1961

GAA CTC AAA AAG AAC TAT AAC AAG CTA AAA
glu leu lys lys asn tyr asn lys leu lys
5941/1981

ATT CCA GAA CTG TAC AAG CCA ATA TTC AGA
ile pro glu leu tyr lys pro ile phe arg
6001/2001

AGA TCT AGT TTC AGG AAA TGT GAA ACC CAG
arg ser ser phe arg lys cys glu thr gln




A. Allelic variations: single nucleotide changes (polymorphism) between CLASP-5 cDNA isoforms

Isoform   Nucleotide(s)   Consequence

1         1727            C to T change; mis-sense mutation changing codon from alanine to valine

2         1749            A to G change; silent mutation

3         2277            G to C change; silent mutation

4         2853            C to T change; silent mutation

5         3427            A to G change; mis-sense mutation changing codon from lysine to glutamic acid

6         3777            C to T change; silent mutation
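The silent versus mis-sense distinction in the table above follows from the standard genetic code. A minimal sketch of the classification is given below. The function name and the example codons are assumptions chosen for illustration (GCC for alanine, AAA for lysine, CTA for leucine); the actual codons at the listed nucleotide positions should be read from the cDNA sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon: str, offset: int, new_base: str) -> str:
    """Classify a single-base change at position offset (0, 1 or 2) of a codon."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "mis-sense"


# Hypothetical codons, chosen only to mirror the kinds of change in the table.
print(classify_substitution("GCC", 1, "T"))  # alanine codon, C to T in 2nd base
print(classify_substitution("AAA", 0, "G"))  # lysine codon, A to G in 1st base
print(classify_substitution("CTA", 2, "G"))  # leucine codon, A to G in 3rd base
```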



B. Alternative splices

Isoform   Difference       Nucleotide(s)           Consequence

1         exon deletion    1806-1944               premature, in-frame stop codon leading to the production of a truncated, most likely soluble protein

2         exon insertion   between 2857 and 2858   additional, in-frame 48 nucleotide exon that contains a stop codon at the second codon, which would lead to a truncated, most likely soluble protein
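Whether a splice variant keeps the downstream reading frame depends on whether the length of the deleted or inserted exon is a multiple of three. The sketch below shows that arithmetic and a simple scan for the first stop codon in a given frame. It assumes the coordinates in the table are inclusive; the function names and the short test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def exon_length(first: int, last: int) -> int:
    """Length of a segment given inclusive first and last nucleotide positions."""
    return last - first + 1


def keeps_reading_frame(length: int) -> bool:
    """A deletion or insertion preserves the frame only if length % 3 == 0."""
    return length % 3 == 0


def first_stop_codon(seq: str, frame: int = 0):
    """Return the 1-based index of the first stop codon in frame, or None."""
    seq = "".join(seq.split()).upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - frame) // 3 + 1
    return None


deleted = exon_length(1806, 1944)
print(deleted, keeps_reading_frame(deleted))  # deletion in isoform 1
print(keeps_reading_frame(48))                # 48 nucleotide insertion in isoform 2
print(first_stop_codon("GGT TAA GCA"))        # made-up sequence with a stop at codon 2
```

Under the inclusive-coordinate assumption the deleted segment is 139 nucleotides, which is not a multiple of three, so the downstream frame would shift; this is consistent with a premature stop codon being encountered after the deletion.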



B 







1st partial exon (nucleotides 3793 to 3952)

CC A GCTGTC A GCC A AGCTC AGT A ACCTTCC AACGCTC ATTTCC ATG AGGCTAG 
AC.TTCCTGAC.AATCCTCTGTAGCCATGAGCATTACCTCAATC TOAACCTTTTT 
TTTATGAATnCTGATACTOCTCCAACATCTCCTTGTCCTTCCAT ATCTTCCCAG 
GTAATAAAAGAATTATTTAACTAAAAGAATTATTCAAGCTAT 

2nd exon (nucleotides 5809 to 5948)

GCTCATAA a ATr.C.CTCCTTACC.TTTCTGTAG AACTCAAGCTC CTGCTCCAGCT 
TCCAGGACCAGAAGATCGCCAGCATGTTCGATCTGACTTCCG AGTACCGCCA 
GC ACtCACTTCCTCACCGCtGCTCCTCTTCACAGAACTOGCTGC TGCCCTGGATG 
rrOAAGGGGAAGG GTATGTTTCTGGCATTTAAAATGGAAGATGAAGC 

3rd exon (nucleotides 13662 to 13831)

CAT.^CCTCTT^ATTrrTGTGTTGTr.rrAACAG AATCAGC AAAGTACAAAGG 
AAAGCTGTCAGTGCAATTCACAGCCTGCTAAGTTCTCACGACCTGGACCCAC 
GrTGTGTC A A ACCAGAGGTGAAGGTCAAAATCGCCGCCCTT TACCTACCTTTA 
GTTGGCATrATTTTGGATGCTTTGCCACAGCTCTGTGACTTTACAG GTAATGG 
CCCTTCTGTTTTCTTTCTTGGATTG 

4th exon (nucleotides 16948 to 17087)

TCTTTG ACTTG A A ' rr ' a f a a a fG a TGTTTTC A TTGr 1 A GTTGC AG AT ACTCGC A 
CiATACCGCACCAGTGGCTCGGATCiAAOAACAAGAAGGAG CCGGTGCCATTA 
ArCAGAATGTGGCTCTGGCCATAGCAGGGAATAATTTCA ATTTGAAAACAAG 
TGGAATAGTGrTGTCTTCCTTG GTATGTTGGTGCACATGTGTCTGGTTGATTTT 

TCAT 



5th exon (nucleotides 19281 to 19463)

jQQCCTC r /iTrrrrr ^ ATCTCiCCTCCCTTCAG CCCTATAAGCAGTACAACATG 
rTOAACGCGGACACTACTCGCAACCTCATGATCTGCTTCCTCTGGATCATGAA 
AAATGCTGATCAGAGCCTCATTAGGAAGTGGATTGCTGACC TGCCATCAACG 
C AGCTC AACAGOATTTTAGATCTACTTTTCATCTGTGTGTTATGTTTTGAGTAT 
AAG GTAAGTCTGGAGTGGCACAACTTTATACCAGC 



6th exon (nucleotides 19829 to 19958)

CACCL^OG^^AT^TrrTrrTArrTrTrTTrTTnTrrAfinriAAAACAGAGTTCT 
GACA AAGTCAOTACCCAAGTCCTGCAGAAGTCAAGGGATGTCAAGGCCCGG 
r-TOG A AC, AGGCTTTGCTGCGTGGGGAAGGGnCCAGAGGGGAGATGATGCGC 



V 



7th exon (nucleotides 20928 to 21015)

TC AAATP^r A Tr A Tnr A TTT^TT AAfTPfTA G OG A ACG ACCG ATTTCC AGGC 
rTAAATGAAAATTTGAGATGGAAGAAAGAGCAGACACATTGGCGGCAAGCT 
AATGAGAAGCTAGATAA GTGAGTCACTCGGCAACTTTCTGCTACTTTTACCT 

8th exon (nucleotides 25765 to 25861)

GCTTTAAT^^ArrTrTTGTTGTTTGCTAG AACAAAGGCCGAGTTAGATCAAG 
AAGCCTTGATCAGTGGCAATCTGGCTACAGAAGCACATTTA ATCATCCTGGA 
TATGCAGOAAAACATTATCCAG GTGAGGAAAACAAACACCCAATCTGATTTG 

9th exon (nucleotides 27242 to 27376)

GGATTC A atp, ATnrTGTTCTTCCATTCCCCCAG GCGAGCTr .GGCTCTGGACTG 
TAAAGACAGCCTGCTGGGAGGTGTTCTGAGGGTGCTGGTGAATTCTCTGAAC 
TGTGATCAGAGTACCACCTACCTGACTCACTGCTTTGCAACACTCCGTGCTCT 
CATCGCCAAG GTAAACTTGGGATGCTTGTTTTCTTCCTCTTAATT 

10th exon

AGTGATG r>r " rA ATGGrrrTTTATGTGTGTrCTAG TTTGGAGACTTACTCTTCG 

AAGAGGAGGTGGAACAGTGTTTCGACCTATGTCACCAAGTC CTGCACCACTG

CAGCAGCAGCATGGATGTCACCCGGAGCCAAGCCTGTGCCA CCCTTTACCTC 

CTCATGAGGTTCAGTTTTGGAGCCACCAGT GTAAGAGTTCAAACCAGCTGAG 

TGACCTGGAATCAG 

11 th exon (nucleotides 31046 to 31204)



TGCAAGTAACCATGTCCCTGGCATCTTTGGTGGGAAGAGCACCAGACTTTAA 
TGAAGAGCACCTGAGAAGATCCTTGAGGACAATTTTGGCCTATTCAGAAGAG 
GACACAGCCATGCAGATGACTCCTTTTCCCACCCAG GTACACCGAAGCACAT 
ACCTTGTCTCATGCATGAGT 

12th exon (nucleotides 32755 to 32855) 

AGCT.VAGAT^ATTTTGAGnrTTArArTTTTTGCAG GTGGAGGAACTTCTCTGT 
AATCTGAATAGCATCTTATATGACACAGTGAAAATGAGGGAATTTCAGGAAG 
ATCCTGAGATGCTTATGGATCTCATGTACAG GTAAGCTTTCCTGACACACTCA 
AGGGACACCATTT 

13 th exon (nucleotides 33663 to 33855) 

TCCTC AAAA A rTTCTC A CTC A A TPTGTCTTC AG A ATTGCC AAG AGTTACC A 

GGCATCTCCTGATCTGCGGCTGACCTGGCTCCAGAACATG GCAGAGAAACAC 

ACCAAGAAGAAGTGCTACACGGAGGCTGCCATGTGCCTGG TGCACGCCGCTG 

CGTTAGTGGCTGAGTATCTGAGCATGCTGGAGGACCACAGC TACCTGCCCGT 

GGGCAGTGTCAGCTTCCAG GTAGGGTGTGTGCAGCTTTTCCCTTAGAGCAGTG 

GTTC 




CTGTTCTCCA^^^ATArTnTrrnTrTrTTTrAG AATATTTCTTCCAATGTGCT 
finAGGAGTCTGTGGTCTCTGAGGACACCCTGTCACCTGACGAGGATGGGGTG 
TGCGCAGGGCAGTACTTCACCGAGAGTGGCCTGGTAGGCCTCCTGGAGCAGG 
rCGCGGAGCTCTTCAGCACG GTCAGTGCCCAGAGGGCATCCCGGGGCCTGGC 

C 

15 th exon (nucleotides 40166 to 40297) 

A^TTCTCTCTnATGrTrTTrTrrTrTTTCGAAG GGAGGCTTATATGAGACAGT 
TAATGAGGTCTACAAGCTGGTCATCCCCATCCTAGAAGCG CATCGAGAATTC 
CGGAAGCTGACACTCACTCACAGCAAGCTGCAGAGAGCCTTCGACAGCATCG
TTAACAAG GTAGCCGGGGAGCCTGGCTGGCAGGTCTTGTTAC 

16 th exon (nucleotides 40755 to 40889) 

T AA G G A G A ^ ^ TTTTT a T A TTTTG TT CCTCAG GATCAT A A G A G A A T G TTT G G AA 
CCTACTTCCGAGTTGGTTTCTTTGGATCCAAATTTGGGGATTTGGATGAACAG 
GAGTTTGTCTACAAAGAGCCTGCAATTACCAAGCTTCCTG AGATCTCACATAG 
ACTAGAG GTA^\GAAAAGTGATTCTGTGCGCCTGACCTGGTACACTTTAC 

17 th exon (nucleotides 42307 to 42396) 

AAC G TTT A T A A APTGTTGnTTC.TTCTTACCTAG GCATTTTATGGTCAATGTTTT 
GGTGCAGAATTTGTGGAAGTGATTAAAGACTCCACTCCTG TGGACAAAACCA 
AGTTGGATCCTAACAAG GTATACAAAAATTTACAAAAACTAACCATCAAGC 

18 th exon (nucleotides 45250 to 45486) 

TCTTCTGG^T^^^nrrnTTTTrrrrrTTAG GCCTACATACA GATCACTTTTGTG 
GAGCCCTACTTTGATGAGTATGAGATGAAAGACAGGGTC ACATACTTTGAGA 
AGAATTTCAACCTCCGGAGGTTCATGTACACCACCCCGTTCACCCTGGAGGG 
GCGGCCTCGGGGAGAGCTGCATGAGCAGTACAGAAGGAA CACAGTCCTGAC 
CACTATGCACGCCTTCCCCTACATCAAGACCAGGATCAGC GTCATCCAGAAG 
GAGGAGGTAATGCACCCAAGGGATTGGCCACCACTGGATGAGT 



19 th exon (nucleotides 48664 to 48807) 

ACAGTG.A^TT^^TATnTTTArGTrTCATGTTCAG TTTGTTTTGACACCGATTG 
AAGTTGCCATTGAAGACATGAAGAAGAAGACCCTGCAGTTAGCAGTTGCCAT 
TAACCAGGAGCCGCCTGATGCAAAGATGCTTCAGATGGTGCTGCAAGGCTCT 
GTGGGAGCTACTGTAAATCAG GTAAGCAAAACCAGAGGTGGCAGCTCCT 

20 th exon (nucleotides 50892 to 50998)

TATATTCTTTTTTTT^TTTTTTTTTTTTTTrrrArCAG GGACCACTGGAAGTAGC 
CCAAGTGTTTTTGGCTGAAATTCCTGCTGATCCAAAACTCTATCGACATCACA
ACAAGTTGAGGTTATGCTTTAAGGAATTCATCATGAG GTAAGAAGGAAAATG 

GCTGGGAATTTCAGTAGAG 

21st exon (nucleotide range illegible)





TCAAAAAOAACTATAACAAGCTAAAAGAGAACCTCAGGCCAATGATCGAGC 

GGAAAATTCCAGAACTGTACAAGCCAATATTCAGAGTTGAGAGTCAAAAGAG 
GTAAGAACAGGGCAGAGGAGGCCTCTTCCTGTGGGAT 

22nd exon (nucleotides 63040 to 63294) 

CCTrrrTrTr.TTTTCTTAATTTCAG GGACTCCTTCCACAGATCTAGTTTCAGGA 

AATGTGAAACCCAGTTGTCACAGGGCAGCTAAGAAAAGCCATCTTCATTCGT 

GGAGACTGTGGCCCTGCAACCCTGGAGAAGGACTTGCTGGTACTTAAAAAAT 

GGGACATTTGCCACCCAGGACTGACTGTACACTCCCTGATCAGCCAGCACTC 

TGGAAGCTTTGGGATCCCAGGAACCATGGAATTATTCCCAAATGGACTCTGA 

CCAGATTTTTGCCATACTGGGGGGTGGCGGGATGGAGGATGGGTACTCAGGC 

ATGACTGCGTATTTATTAAAGTGTGTTTTTCCACAATGTACCAAACAAGGCAT 

AAGCAGCTTCTCCTGCTGACTGGCCAATCACTGCCCATCTGAGAGATGATTTC 

CTCTGGCCCATATTTGAATTTATTGGAGTAACTCAAATTGCCTGAGGAAAAAT 

GGAAAAATTATCCACCAGTCGATTCAAACTGAATTTCACTCTTTATAGGAAG 

GCAGGGCAAACTTGTAGGAGTACGAAACATTTTCAATAAATCTACAAAGGGA 

AGCCTTACTACAATTCCAAAAATCATCATGGTTGGAAATTTGGGAGGAGATT

ATTTGTGAACTTGTTACCCTTTTGGTAATGGTGGACTAATTGCTGTATAGTTAT 

TTTTGTTTTATTATTACTGTTACATTAATTTAACATGCATTTATAGAAGAATAC 

ATTCAAAGCACTGATGTAGGAGATACACGGTACTTGGAGCAGTCAGCCAAAA 

ATCACAGATACTGCTTTCACTTAAATGGAAACAATTCTCCGATAATGCTTTGC 



GTTTCTTTTTATAAAGCAATAATATCTCTGTTTTCATTTCAGAACATTGTGCTG 
TCTGTCAGCATATGTATATCAGCTACAAAATATATTCAACTTTGACTTCTTTTG 
ACAAAGGACTTTAGGAAAAGGAGGAACAAAGACATTATTTGAGAATTAAATT 
ATATATTTTTAATATGACTGTGACCTTGACTGATAATAAAGATGTAATAAGAA 
TTGCAAGCTAAAAAAAAAAAAAAAAAAA




TTTTTCTCCTCTCTGGGACCAA 






1 GTTCTCT GTGGTT AGTCAC TT AGT GACTTT AGAT AAGTTTTTC CAATTTT ATGGCTCTT AATTTC CT CAGTTTTAAAAT AAGAAGGGGGGG 

92 TT GAGAGATTTGAGGGC T GAT CAAC GAAAAGGAT AGGAC CAT AAAAAGCAGT GACAT ACAAGC TT CATTGAGCAGCAC TT GGACAGGGTT A 

CAT AAGAGC GGAAGC C C C T C C CAGCATGAGAACAGC CAT AGGC CT GCAGT GAGGAGGGGAC CAT C CAGAGGAGCAGGGGAACTC C CAGGGG 
AT AGAT C T GGGT GAGGC T GCT C CACAGCACAGT AGGGAGT C TC T GGGTCAGAGAGCT CCAAGGGC T G 



183 

274 AGAGGAGGATTAGGGCAGAAGCTX 



3 65 TAGCAGCT^AGGGCCCTGTATCTGCAAGGCTCTATCTTATCAT^ 

4 5 6 GGCACTAAATGGATGAAAATCTGCTTATATGAGCTATT 

5 47 AGCACCACTCGACAGTGAAGAAATAAACAACCGAG^ 
638 AAAGAAAAAGACATGACTTTTCTAAAGGAAGTCCTGTCT^ 

729 
820 

911 T AGTCT GTTTTCAAAAC ' 



AGGAAGCAGAAT AAT AAAAAAGAAT C CT AACX]AAACATCAGAAGT C C C CAAGCATC C C CAT C C CATGCAC C CTGAC CC C TGCC C T GCAGC G 
GATCTTCTGTCCCAGGACCCACC^GAATAGAATGGCAGAGG^ 

iTTTAAGTTGAGTCTATGAAAGATACCCTAGATCACCACTGCA^ 



T GTT AC TT AGCAC CATT AGT GGCAC T CAGGC C TCAGAAGGT CACTGAC C C C CATTC GT GGT GATTTAATT CATT GATC C CAGC T C T C T AGA 
CTTCATAC CTTAAGCAAGTTGTATTCTTACAAAGTC GTC TGAC TTTATCATTTTGCATAAC CTATTATGTTTTC TGCCAT 



10 02 CTTCATTCCCGACAGGGAAATAAGGCAACCCAAGCTAGATATCT^ 

10 93 TCTTTTCTCTTGCTCACTATCmCTCCTTAAATTCTTCC 

118 4 C CAAAGAT C C CT AGAC CAGGAGT CAAGAAGC C TGCAT GC T GATTC C T GGGACAC CATT AATT AC TTC T GT GATT GGGAGCATCAGC C TTT G 

1275 AGGCATT CATTTCC TCATC TGCAAAAAAAC T AGGCTGGATTAGATTTATC CAC TGATTCTGTGGTCTGTGTCTGCCAGTGACATC CACGGA 
1366 

14 57 TACAGGATGAACT 

1548 CTAAAJ^ATTTTGTGGCR ATGGATATAAC CTTAGTCTTTTTTAATAGTCACTAAAAATAGACAAATCCATTTC CTCAATTACTGTCTGTCAC

1 639 TT ACAAT GT GCCAC T AAGCAAACAC C CAT C C C TT GGGT AGGGGCATT GTGGGT C TT GAC CTTT GGGAAGGAAGTTTTT GGAAT GCAC CTT A 
1730 CTCCTCTTTTCAAAACATCACTCAAA 

1821 TGCAGTGGCACAATTTTGGCTCACTGCAAACTCCGCCTtZACAGCT 

1912 GTGTCTGGCACCACTCCCCGCTAATTTTTGCATTTTTAGTAGAGAC^ 

2 003 AGTGATCTGCCTGCCTCTGCCTCCCGJW7TGCTGGGATTACZAGGTGTGAGC 

2 094 CAAGGGAC C TAT GAAAGAT AC C CAT AGT GGGGCC TT C TTTT AAGT GC CAAT GT GTT GT GGGTTCAAGT T C CGAT AGC C GGC TT GAC C CGAC 

2 185 ACCTGTTAATGAGTAACCTAAGTGACAGGCACATGACCAAGTTCTAATC 

2 27 6 GAGT GCACAT C GAAAGGTT AT GGGAT CT GGT AAC T GT GC TT ACAT AGAAGT CAT AT GTT TT GGTTTT AAAAT AAT ATAT AATGGCAT TT AC 

2 3 67 TT ATTTT AAGTGGAT GT C T AACT AT GAATT AATT C T GTAGGCAAT AT GTC C CACAACACATT GGC TT CTT GT AAAATGGC T GAAAAT AT GT 

2 4 58 GTTCATTT AAATT AT AT TGTTTAGT C TGT AATCC CAGCAC TTT GGGAGGC CAAGGC GGGAGGATCAC C T GAGGC CAGGAGTTC GAGAC CAG 

25 4 9 CTT GGC CAAT AT GGTT AAAC C C CAT C TC T AC T AAAAATACAAAAAATT AGC T GGGCATGTT GAT GTGCAC CT GT AATC C CAGC T AC T C GGG 

2 640 AGGCTGAGGCAAGAGAAT C GC TT GAAC C T GGGAGAT GAAGGTT GCAGT GAGC C GAGAT CAT GC CACT GCACT C CAGTT CAACAGAGCAAGA 

2731 CTCTGTATCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTATATATATATATATATATATATACATATATATA 

2 822 TGTGTATATATATATATGTATATATATATATATGTGTATATATATATATGTGTATATATATATATATATAGTTTAA^ 

2 913 TGAAGT GAGT GAAGAGGAGGCAT GTC TAT CACAAGGATGAT GC TT CAT ATTTC T GT GC T GGGGT GGGGGGTGAT AATGAT GAAAT ATTGAG 
300 4 GAGCTCAAGGTCCATCAGCC^CCCTTCCTTCCCCTATTTTGCCm 

3095 CT GGGGGAAGCAAGGGGTC T C TT C CAGACAGT CAC C TTTTT C T GC TTTTCATT GCTT GC TT CAT GGTTT ATTTTTT AAAGGAAGATTTTT C 

3186 CTAAAAACTCTTCTAGCTTTCTTTCTCTTTCTTCTCCATTTCCTC

3277 GT ACTTT GCT AAACAGTTTGATGC C TTTC TC T AGGGAAC T GGTTCT ACAAC TTTC CAAT GGGGC C TTTAATTAGAAACT AC GAGAGAACAC 

3 3 6 8 TTGT AGTAT AAAAGT CAT C TAGT CAT ATT C TATT AGTTTCAT ACAGGC TCAT AT GAGGTCAAC TC CTTT CATTT AGTTTCTC GAACATAGT 

3 4 5 9 AGAGTTTTTGTAAAATTAATT ATGTT AC GGT GAAGAT GT AC C TCAAGATTTTCAGCACAGGC TTC CCAT GGT ATTAAAGATTTGAT AAAGT 
35 5 0 GGAAATC GGGAT AAGAATC C TCATTC TGGACAGCTACTAGGC T AGAATCAC T AAGGGGAACAGT AAT GAATGAAAATTT ATT AGAC C TCTC 
3641 TGT AAT GCAGAATTT AC CATCTGTT GCAC T AT CC CATT AGTTTTT ATT AAATT GATT GC CTT AAC CTGGAGAGAAAGCATATTTTTGTGT C 
3732 TGCCAACCTCAACTCCACTTACCTTGTAATAAATGTTTCATTCCT 

382 3 AAC GCT CATTTC CAT GAGGC T AGAGTTC C T GAGAAT C CT C T GT AGC CATGAGCATT AC C TCAAT C TGAAC CTTTTTTTT AT GAAT GC TGAT 

3914 AC TGCT C CAACAT C T C C TT GT CC TT C CAT ATC TT C C CAGGT AAT AAAAGAATT ATT T AACT AAAAGAATT ATT CAAGC T ATTT CATTTAAC 

4 0 05 T AGC TCAGTTTAAT CAT GT ATTT C C T AT AAAGGTT AGTC TT ATT AATTTCACAAAAAAT CAAACAATT CAAAC CAGAT CAAGT AT GC TAC C 
4 096 CTGAAGTT ACAC CAC TAGC TAAGAATTAACAATCT AAGT AATT GGTTTCTC C C CAGGC TCAAGGC TC C C TGATCAGGTT AAGT AAAGCCAA 



B 



4551 TCACTAGCTCAGAGCATCA^^ 

4 642 T C TT AC T AATTGAAAAAAAAATC TT AGC CAT ATATGC CAT AT GGCAT GATC CAGAT ATT AGCTACATGAC CATCTT AC T GT GAACAGGGAA 

4733 AGATCTGACTCACAAGCA.GCAATTCAAAATCT 

4 82 4 CTTACCCTAGGTTCTTTCCCACCTTC^ 

4 915 CTAAACACACTTTAC^C^TCTTAATOWVTATAAAGT^ 

5 00 6 ATACTTCC CTCCC T GACATCACTT GT AGTT C CAGGC CAGCAAAAGTCTGACAATGTGCTTAAGC C^AATTCAGAAGTGTAGCTGAGGCCGG 
5 0 97 GCAC GGTGGCTCACATTTCTAATC CCJU5CATCTTC GGAGGC CAAAGGGAGTGGAATAC TTC1AGGCAGGAGTTACCAGCCTGAC CAACATGA 
518 8 T GAAAAC T CATAT C T AC TAAAAAT ACAAAAAT GCAT CAGGTGT GGT AGTGT GAC TGT AATC C CAGCT ACTTGGGAC GC TGAjGGCATGAGAA 
527 9 TTGCTTGAACCCOSGAGAGGGAGGTTGCAGTGAGCTGAGACC^ 

5370 ^^^j^j^^AA^ 

5 4 61 CCCAGGACTG<SGTGAGGTTCTTATTTCTCTGTCCAACT^ 

5 5 52 TCATTCATTTTTCAACAGACCCGGGGTGCTT^ 

5 64 3 ACTCTX7TGCCTCAGTTTCCTTCTCCGCCTTATCTGGCACCAGAGTACC 

5734 CTCACTAAACACTGGCCATCG^TTATTTTCATTCCAGTTC 

5 8 25 CCAGOTCCAGGACCMA^ 

5 916 AC TGGCTGCTGCCCTGGATGCCC^GC^C^GGGTATG^ 

6007 AAATTT GCAGTC T AGCTT C T CACAC TTGGT AAAAAAC TC T AC T GT AGTTGAC CAGTTC T GAGGAGTAGAAACAT CT GT C TT GAGAAT AT GG 

6098 T AC C CAT AAGGACAAGGCACAAGAAAGGC C TTTC TT GTGT AGAAAGGCAC CAGGGAT GGGT AAGAAC TACAAAAT GACTTTTC TT GGTCAA 

6189 C T ATTT CAGT GGAAT TT AC CAGTT C T GC T AT AGCAGGTTT C C CAAGGAT GC TTT GATTAGT GAAC TC C C T AGGAGCAAAGC CATTTTTAAC 

628 0 AAAGGGGAT AGCAT GCAGAGGCAAC CACAAGATGT CACTT C-GTTCAAAGC T GAT GAAGGAAAT AATGGC TGC TGAGAAGGCAGC T GT C C CA 

6371 TGCCCAGATTAG GTTTCTTGCACAC^GTGCTTCT 

6462 T GATTACTCAGGGACAT GT GGCAGT ATC T AGC CT AGAAGT CAACAGACAGAGAGGT AGAC CAC C C CCTT CTTTC CTTCTCTC C CTATGCTC 

6 553 C GT GAGC T CATGGAGTCAGAAAC C CACAGC C TAT C T GATT GGACT GAAAAAGAT AAT GC CITCT AAAATATT ATTCATT C C GTTCAACAAT 

6 64 4 TATTGAATGCCTTCCATGGGACAGACACAGTTAGGTGT^ 

6735 tcaagac<:tcaccatccagggactc^ 

682 6 TAC agaattgcaggaacttc<:ccactaxtcca^^ 

6917 ctataccaaaccatctccctc^cagggaacctot^ 

7008 gcattgctctatctitatggatgga^ 

7099 ttcctc^tgaaacaaatacac^gcacatcgagttcgtgc^ 

7 1 90 T GAAGT GAC T CT AT GTC T GC GTC TC C TC^CAGGAT AGTGTGGGGAAT C TCAT GC TTTTAGCT C TCAATTCTGC CTC CTT C CAGAT AAAC T G 

7281 GCCTGAGTATATCCTTTGAGAACTTCACTTTCC^ 

7372 GT AGAATGTTCT AC C CTACAACTAACAT AAATTT C CACAGCAACAAAAAGT GCAC C GAACAC TT ATGC T AAT AAGT AAGAT AC T GAGAAGA 

7 4 63 AAGTTT GAACACAAAGAAAATTGC C TTCAT GCACAAACAT GT ACAT ACATTTTC TT AGTTGTC C TTT AAT AGCAGT AC TTTAAGT GATTT C 
7 55 4 T AGAAACATC TTT ACT ATTT ACAAT AGC GT AGTTTC T ATTTT C TATTTTCATT C T AGC T GGAAACAGC CATGACATTC T GTTC T GGATT C C 
7 64 5 TT GT AAAATT GTT GC TGTT AT ATT AC TAGCAACAAGGTAGAGT AT ATTCAGAGATAAT CAT GT AATT AT GTTT AAT CAGGT AGAT ACATT C 
773 6 TTCAAACACACACACACACACACAC^ 

7 82 7 ACAGTGT AAC TC C TTTGGCAACACAGTAAT C C CT GATTGC T GGGTT GT CAGTT ACT C T C CT GGAAAGT CATT AGAT ACAC TGT CACAAC TT 

7 918 CATCATGT AAGCT AAAAC C T CAGAAAAATGTCAT GCTTCAACAAT AC TTC CT ATTGAT CAGCACTTC CTTTTTTTTTTTCTTTT GC TTCCA 

8 00 9 AGACAGTT GGTAGAC TT GC CAACAC C TGT ACATC CAT GGGGCAAGGGC CAGAAGGAC GCATC T CAGT AC C TGAAC C CC T AGGGAGC T ACAG 
8 1 0 0 CTCCAGTTTCCACTGGCTTTCCCTTGGATCTCTGACACCTGG^ 

8191 CATTT AGAAATT GT C TAT CAT CTTT GTGTT AAGC CAGCAAATTGCAAT GC C TAATTCAAAACACAAC C CATT GC C GGGAC CACAGACTAAG 
82 82 AACAAGAAAAAC TTTTGAAAT GCAATTT ACAATT ATC TT AC TTTAGC CACAGT GCAAGAGT C T GAGT CATTT AAAATTTT GGTT AAT ATTT 
8373 TC T ATAT AAC GTTGAATT C T GAT GT AGC C TT ATTTT GTTT GAAAGAAAAAAAT GTAT AT AT AT GT AT AT ATTTTTT AAAT C TGGATTCTTT 
8 4 64 TTTrATTATTATCATACTTTAAGTTCTAGGGTACA 

85 5 5 CACCCATTAACTCGTCATTTACATTAGGTATATCTCCTAATGCT1GTCCCT 
8 64 6 TTCCCACCTATGAGTGAGAACATGC GGT GTTT GXTITT^ 
9101
9192 
9283 
9374 

94 65 TTT; 
9556 

9647 TCC 
9738 
9829 
9920 
10011
10102 
10193 
10284 



AGAT AAAAAATAT AATATT C CATT ATC CTC TC TCAGTTACAGGCC C C C CT CAAGTAAC CAC T ATT CT GAC TC TT ATTATT AGAAATTAATA 
CTGCCTGTTGTTGAATTTCATAGTCTTTTGCTGAATGTTGTGACT 

CAAC TGT AT GTCAT ATT C C TTTGT C T AC T AT AATTT C TC TTC T GT AAATT GACATTT GGGC T GC TTT C T ATTT GT GGGT ATTGGGT ATT AT 
GAAAACAGC T GC C GT GAACATGC C T GTGCAT GGTTTT GGGT GGAC GTT AGAAC T CATTT CTTT GGGGC T ATAAAT ACAGC C TATTTTTT AT 
TTT AAT AT AC TGC T C TT GAAT AGTTT AAT AAATAT GT GT ACAT GGT C TT AACAAAAT GT CAAAAGAATAT AC T C T GAGC T AGGAAAAGAAG 
AGCAAACAAGTCAAAGCAC^AACATGC^ 

TC C GCAAGATGAAAAGC TTGTTATTTCAAAAGAGCAACAAAATTCACCAATCTTTAGC 

TT AC TAAAAT CAT AATT GAAAGATT CAACACAAT CAT AT CACAAGAGAC C TT ACAGAAATAAAAAGGATT AT AAAAGAAT AC GAT GAACAA 
TT GAAAGC CATCAAATT GAT AAC C T AGATT AAAT GGATAAATT C C TT AAAAGGT ACAAAGT AC T AAAATT GAC T C CAAGAAGAT AT AGAAA 

9920 ATC CAAAT AGAC C T ACAGAAGTAAAAAGATT GAGTT AGT AAT CAAAC TTC C CACAT ACACC T AC TAT GT AC C CACACAAATTAAAAATTT A
10011 GGCTGGGCGCAGTGC-CTCATACCTGTAATGCCAGCACTTT 

G^3C CAAGGTGATGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCT 

T GAGGCAGAAGAAT CATTT GAGAC T GGAAGGCAGAGGTT GCAGTGAGC CAAGAT CAT GC CAAT GCAC T C CAGC C T GGGCAACAAGAGCAAA 
10284 AC T C CAT CACAAAT AAT AAT AAT AAT AAT AAT AT ATTTT AAAATTT AAAAC TT C CT ACAAT AAAAGC T CAAAC C T GCOSGGCTTT AC TGAT

10 375 GAATTC T AC CAAAT ATT T TT AAAAGAATT AATTC T AATTTTTT AC CAACTT C CAGT C TT CTC TT C CAAC GAAT GGAAGAGGTGGAAT AC TT 
104 66 C C C CAC TT GTTC TAT GAAGC T AGCATTAC C C TAT AC T AAAC CAGACAAAGACAT CAT GAGAAAAC T ACAGGC CAGT AT C T GAT GAAT AT AG 
1 0 S 5 7 AT GT AAGAC C CT CAACAAACACT AGCAAAC T GAAT C CAACAG<1AT AT AAAAAGGATT AT ACAC CATGGC T AAGT AGGATTT AT C T CAGGAA 
, n .- m o nr^-K ^ ^ AGCC ^ r "CA T AC C T GAAAATCAATT GTT GT AC CAT ATT AAJTAAAAT AAAGG A r AAA A C n n A T A r A A T CATC TT AGT AGAT GCAA 

AGAAAAGCATTT AAT AAAAT C T AAT AAC GC TT C C T GATAAAAACAC TCAACAAAC C TTTTAGGAAAT AAGAGAAC TTC C T CAAC TTGAC TT 
AAGGGC CTC TAT GAAAAAT C CACAGC TAAT GT GACAC TT ATT AGT GAAAAACAGTGC TTTAT C C C TAAGATT AGGAACAAGACAAAAAT GT 
C T AC C C TT GC CAC TTC T ATT CAACAT ATT AGGAGTT C TAT C T AGGC-CAATT ACOC7\AAAAAAT AAAACAAAAGACATC T AGGC CAGGC GT G 
CTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGG^ 

GCAAAAC GC CAT CTC T ACAAGAAAT ACAAAAATT AGC TGGGCATT GGT GGC TT GT GT TT GT AGT C C CAGC TAC TT GGGAGGTT GAGGCT GG 
11194 AGAATTGCTT GAT C C CAGAAAGC GGAGGTT GT AGT GAGC T GAGAT CAC GC T AC T GCAC T C CAGC C TGGGC CACAGAGT AAGAC C C T GTC T C
11285 AAAAAAAAAAAAAAAAAGAAAAAGAAGAAAAGAAAAGAAAGCAT^ T ATCTCTATTCAT AGGTGGCATAA 

1137 6 T C TT GTTT AT AGAAAAC CAT AAGGAATC CACAAAAAACT C CATTACAACT AAT AAAT GAATT CAGCAGT GTT GCAT GGT AT AAGAT CAACA 

114 67 T ACAAGAAT CAAT T GTGTTT C TAT ACAC TT AC GAT GAGCAAT C TGAAAAT GAAATT AAGAAAACAATT T CAT AT AAAAT AGCAT CACAAAG 

115 5 8 AAAAAAT ATTTAGGAAT AAATGT AACAAAAGAAAC^ T AAAAAT GACAAAACAC T GTTGAAAGAAAGATAT AAATAAA 
1164 9 TC^GAGGAT ATCAT ATGTT CATGAATGAGAACACTT ATTATTAAA^ C C GAAATTGATCCATAGATT AAATGCAGTTCTTCT 
1174 0 CAGAATTCTAGCTTC<nTTTTTTTTTTTTTGC^ 

11831 AAAACAACCTTGAAAAACAAGAACAAAATT GGAGGACTCACACTTCC CAATTTCAAAACTTACTACAAAGC^ 

11922 GGTT CT GACATAT GAT AGACATAT AGAT CAAT GAAATTGGGTT AAGAGTC CAAAAAT AAAT C TT CAT ATTTAT AGT CAATT GATTTTTGAC 
12 013 AAGAGTGCCAAAACAATTCAATGGGGGAAAAT AGAATTT^ CACAC TCAAAAGAATGAAGTT 

1210 4 GGAC CC T AT ATT ACACT GT AT ACAAAAAC T AACT CAAAT AGAT CAAAGAC C T AAATGT AAGAGC T AAAAC TAT AAAATT GTTACAT AAAAT 
12195 T AT AGAGGT AAT CAT CAT AGACTT AGAAAAGGCAGT GGTTT C TT AGAT AT GACACAC T C GAAAGT AT GAGTAACAAGAAAAAAAT AGAT AA 
12 2 8 6 C T GGAC TT CAGT AAAATT AAAAC TTTTGT GATTT AT AGGACAC CAT CAAAAAAAT GAAAAGGCAACACACAAAAT GGGAGAAAAT ATTT GC 
AAAT CAAAAAC C T AAT AGGGGAC T T GTAT C T GGAAT AT AT ATTTT AAAAT C TT ACAAC T CA.GT AAT AAAAAGACAAAT AAC TCAGTTTTTT 
AAAAGGCAAAAGAT CAGAATAGACATTT CT C CAAAGAAGAT ACAGC CAT AAGAC CAT GAAGAT GTTCAGCAT CATT AGC C GTCAGGGAGAT 
GCATATT AAAAT CACAATT AAAT AC CAC TTGATACC CAC GAAGAT GGATATAATAAAAAAGACAGGT AAT AAAGTGTT GGCAAGAAT AAAA 
12650 T GGAGT C TT CAGACACT GC T GGT GGGAAT GT AAAATT GT GCAGC CAC C GTT GAAAACAACT T GC T GATT C CT C T AAAAGTT AAACAGAGGC
12 7 4 1 T GGGCGC T C GGC GGC TCAC GC CT AT AAT C T CAGCAC TTT GGGAGGC T GAGGT GGGCAGATCATTT GAGGC CAGGAGTT C GAGAC CAGC C T G 
12 832 GCCAAGATG/TIGAAACC CT GTCTCT ACT AAAAAT ACAAAAATT AC^CAGCTGT GGT GGCAGGT GC CT GT AGTCC C C GCT ACTTGGGAGGCT 
12 923 GAGTCAGAAGAATTGCTTGAACCCAGGAGGT GGAGGTT^ 

13014 AAAAAAAAAAAAGTT AAACAGACAGTT AC CAT ACAAGC CAGCAAAT GT AC T C T GAGGT ATGT AC C CAAGAAAAGT AAAAC TTAAAAC TT GT 
1310 5 AT ACACAT AC TCAT AGCAGC GTT GGT AAGT CACAAT AGC T CAAAAGCAGAAACAAT C CAAAT GTTT AT CAGTT GAT GAAT GGAT AAAATTC 
13196 AC CAAT GGAAT ATT ATTT AGCAATAAAAAGGAAT GAAGT AC T GAT GC T ACAAT ATGAT AAAC TT AAAAACAT CAT CXTT AAACAGCAGAC C C 
10921
11012 
12377
12 4 68 
12559 
12650 



B 



13 651 TGTGCCAAC^GAATCAGCAAACT^ 

13 742 AAC CAGAGGT GAAGGTCAAAATC GC C GC C C TTT AC C T AC C TTT AGTT GGCATCATTTT GGAT GC TTT GC CACAGCTCT GTGAC TTT ACAGG 

13 833 TAATGGCCCTTCTGTTTTCTTTCT^ 

13 924 ATATOTCATATGAGCAATTCrrCACAAAAATC^TCTCCAAACCT 

14 015 AGC C C CAT ATGT GTTTTTT GTTT AGAAATT ATTT AC TTGTTT C IT CATTCATT CATTT ATT CATT CATT GATCTGC TAGT ATGGAAT GAAC 
1410 6 TTT ATAAAC CAT ACAC CACAAAGGAT GAAAT AAT AACAAGATT AC T GC CTTT GGAGAT C GT GC TC TAACACCTT GAAAT AAAGGT GTTC C T 
14197 TCCCTTTTCTTTTATC^TATCTAAQCAAAAAAGCGCTCT 

14 26 8 CAAT ATT AGAAAATT CT ATCATAGCAGAGAAATATT GC C C TT GGCAC TAAGTGACT AATTTT GAGCTTAATAQ > CTCTAGATTAC<1AACAAG 

14 379 AT GAAAGGAT AAGTTGT C TT AAAGGGGTTTT GTGAAAC C C CAGAAT C T ATTT ACAAATT ACATT GTGGATTAAGGAATCT AGAGGGAAAAA 

14 47 0 AATCCTCCCAAATAGAAGAACTTCTTATGTTGACCTGAGAAAG^ 

14 5 61 TGTG cc^c<;ctc^cacacacacacacacaattc 

14 652 TCC atc<:tgctctcttggcagctgaaggct 

147 43 ttggaaccatccaggttttcccacctcctccaaatgcww; 

14 8 34 AGTTCATGAAAATGC<^CCAAACGCCTATACTCTTTAA^ 

14 925 CTTTC TTGAAAAAAATCCTCTAGCAAACTC^ 

15 016 GGGGC CAGAGTGCAAGGAAAT GAT AGAAACAAGC T CAGCACACAGGGC TGAGCAAGC T GCAC CAT GAGGT CAX3CAGCTT C CTC T AGCAGC G 
15107 C C TGATTGC GC GGAATT GAAAAT GGAGTT GTTTTT AACATTT ACAGACAT AAT GCAGAGCAT GGCAT GT GAC TT GT AGC C CATTTTGAGAA 
15198 TC C TTGA.T GGC AA (^TTT C. T AAAAAGTTC CATTTCAAGAGC TT GCTT GTTC C CAAAGC CAGGAAC CACTTT AC^IACACATT ATC C GAAGTTT 
15289 T CTT GT CAATTAGAATATT C C GT AT CAGT GAC TC GGAATCAGAAC TTTTCAACATTTGGTC TC CAAGC C TTTT AAACC T CAAAGAC TTC TT 
15 380 TGTATATGCGTTGAACTTTATCATCATGATCTAGCTGATGAGAGAA 

15471 C T ATTAAAT ATCGAATC TTT AGGTT GAT AAAC TCAT AAGAAT ACAGT C TTT CAAAAAGATGCAT CTGAATTCAGATTC CAGC CC CATTT AT 

15562 TAATATAAGTGACCTTTGW^GGCTCAACCTTTCCGTGC 

15653 GTTT AC GAT AAAGGT GAAAC T AGAGAAC TT C T AT AAAAGCATTTT GCACAGCACAT GTTTGT ATCTTTC C C GATTTTT CTT GT AAC T AT AA 

157 4 4 C C C TAT GGCAATTAAGGGGAAAT AAGAAT GT GTC T C TAT GTT AGTT GT GAT AAT GTT AT CAGGT C TT GCATAATTT CCAT GTGCT GTTT AT 

15835 TTAACCATTT CTTT AAAATCC CAAT GGCCTT AAT AAGTATT^

15926 GCACTTTGGGAGC^CAAGGTGGACAGATCACTTGAGC^C^ 

16017 ATACAAAACTTAC^CAGC^GTGGTGGCGCACCTGTAATCCCAC^ 

1610 8 AGC GGAGAT GGC GC CAC T GCACT C CAGC C T GGAT GAT AGT GT GAGAC TTT GT C T AAAAAT AAAT AAAT AAAT AAAT AAT AAGAAC T GGGTT 

16199 TT GTTTT CACAAC TTT AGT AGAAGAAAT GTT ATTT ACAC TCAAATTTTTC T AAATAATT GAAGGC C CAGATGGC T GTAAT GTCAACAGGT C 

16290 T AGAAAAACATGAATTTTT AGGAAAACAT GAGTGAAT CAAT CAGTT GT GAAT GTTTT AC CACAC TTTT C C CAGAAC TGC TT GAAAT ATT AA 

16381 TAGTTTTT GATT GT ATAC T GGTAGCATT C TT AAAAAC CAGACTTTT AAAC GGTTTAT C CTTTTTTTGCTT GCAT ACAT CT AGAC C T ATC TA 

16472 TA AATATC<XJrATGTCTATATTACATAGA^ 

165 63 T AAT AGC TT ATT ATTCT AT AT AAGGT AGTT GC TT AATTC T GT AATT GT AGGT GT CTTC T ATTT C-GTCATT ATTTAAAATAATGC CAATT AT 

16654 TAGAAT AGAGAAT GAAGTTT AAAAAATT AT GTT ACAGGAAACAATT ATGGAAGGTTT GAAAAC TTTTTGTTCACACAATTTGAAAAATT AA 
16745 TTTCTAGCCTAATCTTGTGCTAGAC^TTGTCTCTTAGCCTGCTGTGTT^ 

1683 6 T GAAATT AC T GT GC TGAC TTT AGT GACT GAGAAGT ATCAGT C T CTT ATTGGGT AGGGGACAT GGGGAAAT GT CAT GTTT GACTTGACAT CA 
16927 CAAAC GAT GTTTT CATT GCAGTT GGAGAT AC T CGCAGAT AC C GCAC CAGT GGC T CGGAT GAAGAACAAGAAGGAGC CGGTGCCATT AAC CA 
17 018 GAAT GT GGC T CT GGC (3AT AGCAC-GGAAT AATTTCAATTT GAAAACAAGTGGAAT AGT GC TGT C TT C C TT GGT AT GTTGGT GCACAT GTGT C 
1710 9 TC<7ITGATTTTT CATTT CATGTCTTCTGTCTTCT 
17200 GCAGTCX^GTGGCGCAATCTCAGTTCA^ 

172 91 GCAC CAC CACAC C T GGC T AATTTTTTTT GT AT CTTTT AGT AGAGAC GGGGTTT C GC CATGTT GGC CAGGC TGGT C T CAAAC TC C CAAC CTC 
17382 AGAT GGT ACACCCACCT CAGC CTC CCAAAGTC<TGGGATTAC^ 

17 473 ACTCAGC TT AGAGCAGGCATT GATT ATTT C C GC C T CATTTT GCTGGAAAGAATCATT GT GAGGC T GGCAGCT GAGGTGCACACAAGTCAGA 
175 64 AAGGGTT C T GCCAACAGCAAGAGAT GTC^CAGATCAAATT CAGGCAGT C CCAGT GGT GCT GAGTT CAAGTT CAAGAAAC C GGAAAAGAGGAA 
17 655 AACTTGGGAAATCTGAAATCCTTGCAAAGGAG 

177 4 6 ATTCAT ACAGTGT AATGC CTC CAAAAAAAAAGAC T GAAGGAGGCT GGT AGC TT AGC GT AAGGGC TTC TT AAACAAGAC T GCAGC TTTTCT A 
18201 AAAGACATTT GAGATTT GACATCAC C GATCTTACTGTTGGAGAAAT ATTGTGTGJ\C CTC CTC CAC CTTTCTAATATCC GTCTTAC CAAAGT

182 92 T AGC TGC TT GAGGT GGT AT AT GC C TTTT ACATT ATTT GC C T C T AAGGGAAAAAC TCAAAAGC C CAAAGTTCAC C T GTT AGAACAT AGTC C T 

183 8 3 T GTGAGGTT GTAT C T CAAGATTT C C TTT ATT C TT GT CAACAAACT CAGAAT AAC TAAAGTT AAAGTT GC TTT AT AAC CTC T ATT ATTTC CA 
18 4 7 4 T C CAAAAC T AACATTTC C C T C C CAT AC C CACAAATT C CT CATT GCAATTT CAC CAGT ACATTC CACT GGATATT AGCT AC GCT GCAT GAAC 
18 5 65 CAAGGCTC^GCCTCATTGTTCTTGTTTCT 

18 65 6 AT ATTAAAGCAAAT AGC C T GAAC C C C CACAC C C CAGC C C CAGGCAAACAT AAATTAT GGTT AAAC TTC CATT ACAGAGAAC TC CACAAACA 
1874 7 TGGATTTGATTAATTTGCTGAGCAGTO^ 

188 3 8 TTCATGC<^CCC7VGCCAGTTTCCTTCTCCCCTGTAAGTC 

1892 9 TGAAAGCCTCTTTGTATCACTGG<^TTGCAGCAC GCATGAT CAAGGC C CAGGGGTGATCACCAGGCCACACTGCTC CT AAGACAGAGGTAC 

19020 T CAGATACGTGGCTGAAAGCCTAG<:TCAAATACT^ 

19111 C T AGGGAAGGCAGGGCT AC T GGCAAT AGAT CTC CAGC CT AGCAGT GAT GT ACAGTCATGGT ATTTT AAGAGAACAC TTT GAATTTTTCT GT 

19202 TGCTTGACTGTTAAC^CTCAAATTTTTCTOT 

192 93 T A(^ACATGCTGAACGCGGACACTACTCGCAACCTCATGATCTC^ 

193 8 4 GGATTGC T GAC C T GC CAT CAAC GCAGCT CAACAGGATTTT AGATC T AC TTT T CATC T GT GT GTT ATGTTTTGAGT AT AAGGTAAGTC TGGA 

194 75 GTGGCACAAC TTT AT AC CAGC TCTT ATCTCTCAATT GCAATTCTGTC TTCTTAC TCATC CC C TTTGTTT GGGC CATGGAGGGATCATTAAT 

195 66 TTTT C TCATTTC T GT ATT CAAAT C CAT AAC C CATTT GT AGGT ATAGAT AT GAT CATTT CACAGGGAAAGGAT CTCTGCCTTCT GCAGAGAG 

19 6 57 AAC C C CATTT CT GTT GACAGAGT TTT GGC C CATAGGATGC T C CAGAGCAGCAT C TCAGT GAAGCACAT GT CAAAC TT AGC TGGCAT CAC T G 
1 974 8 TGGAGTGTACTGTTTTGGTAACTCTCCCCAT 

19839 GTT C TGACAAAGT CAGT AC C CAAGT C CT GCAGAAGT CAAGGGATGT CAAGGC C C GGC T GGAAGAGGC TTT GC T C C GTGGGGAAGGGGCCAG 

19930 AGGGGAGATGATGCGCCGCCGGGCTCCAGGTGTGTTC^CTGGC 

2 0021 ATGTC C T C C CAACAT GATT AGACAC CATT AC TTT C TT GAGAT ATTT AC GGT AGT GT CAGAGACAGCGGATTC T GGGAGT C T GT GT GT GACA 
TTTGTGTTAGCCCTGTGCCTGTGAGGGAAAGGGCTGT^ 
CTCOTGATGGTAAAAGGTCACCTCTAACAATGTC^^ 

CTCTATCCC^GCATGCTCAAGAAACTGAACAGTTCTG^ 



20112 
20203 
2 0294 TTTGGCTAC 



2 0 385 AACCAGTTCCCTCAAATC^TTCACAGTGTCCCAGTCATTC^ 

2 0 4 7 6 AAT GAAAT C C C CAGGTAC C C T CAGT C TT ATTCAC CAT GC T CAAAGT AAAACAGAGT GACAGC TT ATT GT ATT C GAAGGGACACAGT GGCAG 

2 05 67 GGAACTT GGAGGGAGCT CAT AGTTTT CAGT GGTG-GTTCAGGCAC C C TCATTT GACAC C CAT AC TTTCAT AC C CAAT AAT T CAGT AAGC C C C C 

2 0658 AGAGTTTCACAGGAATC C T C T GC T CATGAGAAAT GT C TC T C T GACAC T CAGAAAGGCAGAGGTT C TT C TT AT ATT C TAGT C TAT CAATCAA 

2 074 9 AGAGCGGGC CAAC TT GGACAT AGGT GTGGC GACTTT GTC T C C T AC CAGCAAC C T GCAT GGAC T C T AAT T AGC C C GAGAAAT GGT GC T GAGG 

2084 0 C TT C TCAGTT GAGC TTGTT AT GAAC TTC T GGTT AT C TTGGAGGGT TTCAT GC T AAT CAAATT C C TAT CAT GCATTT CTT AACTC C T AGGGA 

20931 AC GACC GATTTC CAGGC C T AAATGAAAATTT GAGAT GGAACAAAGAGCAGACACAT T GGC GGCAAGC T AATGAGAAGC T AGAT AAGT GAGT 

21022 CAC TC QGCAACTTT C TGCT AC TTTT AC C T AAAGTC CAAAAC T ATTTTT C C CAGGCT GC TTGT ATT AC T GAAACAAC TGCAT C C TT C CAAGG 

21113 GTT AGAAAAT GAAACATCATT AT C T GTGT AAATACAATT CAT C CAGGGAC C CAGGAT AATCAAAGCT AT AGC<^GTTGT GGTTT C CAGC T C 

212 04 TTAAAAATTT CTCACATTGATAAGCATGTAAAGAAAACT AAT ATTT CTTT AGCAAC CTCAGAT GGCTTAATAAAAGCAGCTGATTTTGCAG 

212 95 GGAGGGT AGCAGGGAAAT AGAGAAAGCAGGACAC GGT GC C T AGGAC C GTAT AC TTT CAAAT C GAT ATTT C CTTT C T GGAAATAT GT ACAAG 

213 8 6 AT AT ACATT CAGAT AT ATTT ATGT CAGT GCT ACTT AAAGTT GTTTTTT AAAATT GAAAACATT C T AAAT GCT C CAGAAT AGAAAAAT AT AT 

214 77 XT AAAAGTT GGATTGC CAT CAAAATGTTTTAAAATCATC TTAATGACATGGGGAAAT GC TT AT GACAT AATGTT AAAT GAACAAAGGCAGG 
21568 GTAGACACTT GGTTTTCT AGT AGT GTGT GGC TCAC GC CT GT AATC C CAGCAC TTTGGGAGGCTGAGGC C GGT GGAACAC C TGAGGTCAGT A
21659 GTTCGACACTAGCCTGGCCAACATGGTGAAACCCCCTCTCTACAAAAAATAG^

217 50 AGCT AC TCAGGAGGC TGAGGAGGAAGAGTC GCTTGAACC C GGGAGGCAGAGGT T GCAGT GAGC CAAGAT CGTGC CATT GCACTC CAGC C TG 

218 41 GGT GACAGAGCAAGACT C CAT CT CAAAAAAAAAAAAAAAAAAAGAAC C TCAAAT CT C T GAT GACAT AT AGTCAAT GT ATTTTTT AAAGGC T 
21932 AGAAGGAAATTT AC CAAAAT ATT AATTGC GGTTTT C TTT GGGT GC T GGGAAT AT C GGT AATT GAT GTTTT CT C TTTTAT AC TTT GC TAT GT 
22023 TT C C TACAGT AGATGTCT ATT AT AAT AT AT ATTGTT ACCAGAAA^ GAAAGAAT AAAAAAAC C T GTTT GGAT CAAA 
22114 ATCTATGCTATAGATGAATAGAT ATATTGAC GTGT ATT GC TAT AT AT GCAACAATTT AGACAT AT GAAAGCT AAAAGT AT ATAT GTTTGAC 
2 22 05 T AGATAGCACAT GTTTC C GATAC C GTTT AGC CACAGGTAGC C T AAT GAGT C C CAGGGT TTGC T GAGCAAAAT GT C GTACAGCT AAAATCAC 
2 22 96 CC CTCACACACAGTTGACAAGGT AGGAT AC C C CAT GGGT GAT GAAGC AC C CAT CAAAGGGGGTATGGGGCAGACTT GATT AGCAGT GCTGT 
22751 TTTGCCTGGGCTCTCTAAACTGGTGTT^

2 2 8 42 ACCGGTTTGCTCTAGGTTAGTTCTCAGCCCTGCGCTACCTGGTGGGCTGCT 

2 2933 GGC TTT GT GC TTTTTTTTTTTTTTT AAAAGGGAAAC CAT AGT AAAATT AT AAC T GTTCAAC CTAC CT AC CTGGAAAATTT AIAATT ATAAT 

2 3024 AATCXTTGTCTTTCACCCTCTTTCTGATCATTTTTCG^ 

2 3115 AGAATCACCTGGGGAGC TTTAAAAC C TGTCAGTGC C CAGC TGCAC CCTAGACTACTTCAATTGGAAACTCCAGCAGCACC CAGAAATCAGT 

2 3206 ATTTGGTAAAACTTC C CAGGT GATT C TATTGC CAC CAAATCCATTGTTTTAAAGGAATAGATGGT^ 

2 32 97 TTTCITTTTTTXjAGACAAGGTCTCACTCTCT 

2 338 8 CAAACAATCCTCCCACCTCAGCCTCCCAAGTAGATGGGACTACAG^ 

2 3479 A GGGTTTCACCATGTTtSCCCAGGCTGGTCTAAAACTCCT 

2 35 70 AT GAGC CAC T GC GC C TGGT CAGAT GGT AAAGC TTTT AAAAAAC CAGATTAGT GTTAGTGATGGTT GCACAATATGAAT GT ACTT AACAC TA 

2 3661 C T GAAC T GT ATAC TT ACAAAT GGTT AAGATCX3TAAATTTTT ATGTT AGGT GT AC TTT AT CA.GAAT ACAAAAATTTGGGAAAAAACAGATTA 

2 3752 GGACAC T C T AGATTT GT GC T GTC CAATACAGT AGC CATT AGC CACAT GTGAC T ATCAAATGC TT GAAAT ATGGCT AGTT CAAATT GAGAT A 

2 3 8 4 3 ATAAAT CAAGATAAGTGAAAAAT AC GCAC CAGATTTT GAAGGC TTATTGT GAATAAAAGAATAAAATATTTCACTAGTAATTTTT AT ATTG 

2 393 4 C T T ACAT GGACACAGTATTTTTT AT C TTTT AC^^T AAACAAAATAT ATTAAAAATAAITTCAC TTTTT CTTTAC TTTTTT AAT GTGGTT AC 

2 4 02 5 CACAAT AT AT AAAAT GACAT ATGT GGC C C C CATT GTTTC T AC T AGACAGCATT GCT C T AAATTT AAAAC T AAAC T AGCAGTCAATT AATT A 
AAAAGGAT GATAAGGGGC C GGGCACAGT GGC T CACAC CT GT AATC C CAGCAC TTTGGGAGGC C GAGGT AAGT GGATCAC GAGGT CAGGAGA 
TCGACACGATCCTGGCTAACC^GGCGAAACCCCCTCTCT 

2 4 298 GCTACTCG^GAGC^TGAAGCAGGAGAATGC^GTCAACCCC<^^ 

2 4 389 AGC C TGGGC GACAGAGGGAGACTC C GTC T CAAAAAAAAAAAAAAAAAAAAGGAT GAT GAGGTT AAAAT GGTAAATTTGAT GTT AT GT GTAC 

2 4 48 0 TTT ATCAC GATAT AAAAATTT GAT GGCT CAT GC C TGT GGT C C CAGAT ACT CAGGAGGC T AAGGCAGAGCATCACTT GAGC C CAGGAGTT C G 

2 4 571 AGGCTTCAGTGAGCTATGATAGTGCCACTGCACTCTAGCCTGGGTGATACA 

2 4 662 TT AGAITTCATTT ATTTT ACACAT AT ATT AT CAC TT GGAAAAT GAGAAAAAGT GTCAAGTGGC TTGGGAC CAGAGAGC CTATCCTAAACAT 

2 4753 GAAAACAAGT AAAACACACAGAACT^ GAGT C CT CAGT GGT AT GT AAGCAGC T GCAGT GC C CC CATT ATT AGGTT AAT GGGA 

2 484 4 CGCAAGAACAGGTAAGT GGT AAC CCTGGC C CAGGACATATGAGCTCATATAATGAT ACCCCAACC CCA^ 

2 4935 CAT C TT GGT AAAGT CAATT C TTCAT AC C T C C C TTT C C TT GCAACT AGATTT GGATGAT GAT ACAAAAT ATCC C TTT ACAGC TT CAC TTAGA 

2 50 2 6 TTTCATAAGiAATGGATGGGCTAC^ZAAAAAAAACCATTCTGATrCCT^ 

2 5117 T C C C TAGT GGGCAAT GAC CAGT AAT GTC C GGCAGGAT ATT AGATCAC C TGC C C T ACAGGAAT ACAGTC TTGTTTC C CAAT GGAAAAGGAC G 

2 5 208 AAAGACCCCACGCACTTC^3CTCAGCAACCTCAAGGTGAT 

2 52 99 CCTTTCCTAACCCAGCCC<^T(^CTGCCAAGTTCACATC^ 

2 5 390 TC T AGCAAT GGTTTT CAT GC GTGAAATACAGC CAT GGC C C T GAGGC TTT AGGCAACAAT CT GAGAGGGGAGC TT AATT GC T AGT AGCAAC T 

25 481 AATAACTC^ITTCTCTACCCATAGTGTTA^ 

25 572 CT AGACAGTT GTTT C GACAAGACAT GAAT CACAGAAGGCAC C T GCAC T GT AGTT AC T CAGGGC CAGTT GC TCT GTTTT CATTT CAAGGTT G 

25 663 AC T ATTTT GGAGATTTC TTT ACAC C TTGGT GT ATAGATT GC CATCAT GGGAAC C TGGC CAGGTTT GACAT GC GCTTTAATTTGAC CT CTT G 

25 75 4 TT GTTT C C T AGAACAAAGGC C GAGTT AGAT CAAGAAGC C TT GATCAGT GGCAAT CT GGC TACAGAAGCACATTT AATCATC CT GGAT AT GC 

2 5 8 4 5 AGGAAAACATT AT C CAGGT GAGGAAAACAAACAC C CAAT C T GATTT GTTGGC CATGAAT AT GTTT AC T AGAAT AAGGAC TT CTTT AT GCAA 

2 5 93 6 AATT GT GAAAGACAT AAAT GT GAT C C CAT AGT AC C TTTTTT AAAAAAATCIAAGTTGAGAACTTT ACT GT CTAC C TTAT A 

2 60 2 7 AATTCCAGAGATACCAAACATTCTTCTGGCTTCTTTGACTTAG^ 

2 6118 AAATCTGTGT GAGT ATGT AC CAAGTTTAT AAT ATGGATGTT GGGTTTATC GTTT AGT ATCTAGAACAGT AGT GGT AAGT AGAATTTTTTC T 

2 62 09 GAT GGGTCAACTC CACTT GAATGATGGTCAC TGTC TGAT AT GGGAGCT ATGATT AT GACTAGGCTAGGT AAAAAGAGT GCT AAATTT GACA 

2 6300 AAT GATGT C TTC TTT GGAC TT AAAT TTGTT AAGGAAAGT CATTTGT AC CAT GAATTT GC CAT C C C TGC T GTAGAAAAAT AT AGC TTT GT GA 

2 63 91 AC TTTGTAC CAT AC TAATTTTATCTTCT ATGT GATT ATTTC CACAAATTC C CAAGC T GTCT AGCT AATAATGAGTTTTTAATT AC C C TGAA 

2 64 82 AAAT GAGTT C TT ACATGTTT C CATT GAGAAGT CATT CATT AGAGT AGGTC CAGGATT GC TTTT AGGGC T AGAAGAAAT AT C GTT GAAACAC 

2 65 73 AGT GAAAT C TT AATT CT C T AACTTTT GAATT GTC T AAAAT CAAAGT AATCAT CATACAAAAAT AAACACAAAAAGT ATGT GAT ATTTTT GT 

2 6664 T GAC TTT AAT AT C TTTGAT AACTT AAAT GCTT GGT AT CACATTTAC C TTAT C TTT AT AT AGCACAAT ATT AGGT GC CAAAT ATC TAT ACT A 

2 6755 GC C C C CAAAT AT ATTTGCAGTTTT CAAAGAAAGC T GAAAC C TTTT GTT ATT AT C CTT GGTGTT GTT AGT C CTTC T GTAGGT GAT AAACAAG 

2 6 84 6 CTTCTATTT AGAAACATTGCTGC CAC CAAGCAGCC C CTGTT GTACTGGGAAGC C CACAATT GTGTTTT GCATCCCATAAGGAAAGCTATGT 



24116 
24207 



B 



27301 GAATTCT CTGAACT GTCATCAGAGT ACCAC CTAC CTGAC TCACTGCTTTGCAACAC T C C GT GC TCTCATC GC CAAGGT AAACTT GGGAT GC 

27392 TT GTTTT C TT C C T C TTAATT AAGAGT AAGATT CT CATCT AGC TTCAT ACTT C TCTC TTCAGGTGGAC CAAAAGTCACAGAGCAT ATT AAGT 

274 8 3 GGCATCACAGTAAACCTCTTAAGTCTTCCTAGGAAGAAAGCAGATGC 

2757 4 CCCATATTTGAAGTGTGGACCTAACTCTAGAAGTTTAAAATCX^CATTCGCTGAW 

27 665 TTAGCAATTTCACTC TTTCTCTGTAATAC CTCTGC TGAGT GAGATTAAATC C TCTATGTGACGC C C^TTAGTCTTACAAAATGTCATGCCA 

277 5 6 T AAAAT GC CAGGAAGGT CAGAAAT GAATTT C TCAC GGCC T GAGGAAT GAGGATT AT C C T GGGGT AACAT GCAGAT T ATTTTTC C CTTTATT 

278 47 TATTTATTTATTTATTTTTGACACTGAGTCTCGCTCTATCGCCCAGG^ 

27938 C C T GGGC TCAAGC GATT C TCATGC C T CAGC C T C C T GAGT ATT GGGATT AT AGGC GT GT GCCAC C GCAC C CAGCT AATTTTTGT ATT ATT AG 

2 8 02 9 TAGAGACAGGGTTTCACCATGTTGGC 

2 812 0 TACAGGCGTGAGCCCCCGTTCCTGGCCTATTTTTCCCTTTATTGAAGATC 

2 8211 GTAAAATTCTTC TGCTCATCCTTCTCAGGAC CATTTTCTC TTTCTTCATCAC CAGTAATTTCCCAGGAACCCAAGAAAC TCAGGTTT CCTT 

2 8 3 02 C CAT CAT AGTTGT GATTT CAC CAGT GAAT GC GAC C T GGC T CACAGT GCAGTT GAT AACACAGC T C TGAC C CTTTT AGC T GGACAGTTCATT 

2 8393 ATTAAATCTCAAGTCTACTCCATTC<:TTAAATCCATCTTCTCATTCACA^ 

2 8 4 8 4 C TTTTT GC TT GT C CAAAT GGACATTT GCAT ATTTCAAC GGT C CAGAAAGT GT AT CAAAC TGC CAAGT GAT GC C T AATGGC C CTTT AT GT C T 

2 85 7 5 C T C C TAGTTT GGAGACTT AC T CTTT GAAGAGGAGGT GGAACAGTGTTT C GAC C TAT GT CAC CAAGTC C T GCAC CAC TGCAGCAGCAGCATG 

2 8 666 GATGTCACCC GGAGC CAAGC CTGT GC CAC C CTTT AC C TC C TCATCAGGTTCAGTTTT GGAGC CAC CAGTGTAAGAGTT CAAAC CAGC TGAG 

2 875 7 TGAC CT GGAATCAGT AGAGAAAAATT GAT GT AAAGCATCAGC T GC GAAAAAAAATAAGGAAAT TTT GCAGTATT GCAGTTT AC TT C T GT C C 

2 8 8 4 8 TGTGAGAAAGAAACAATTGAGTATGTACATAGAT^ 

28939 C T C TAT CAGT GT GTTTCT AAAAT AGACAGC CAGGGGC CAGGAACGAT GGC TTT CAC C TATAATC C CAGCACTTTGGGAGGC CGAGGT GAGT 

2 9030 GGAT CATTT GAAGT CAGGAGTTCAAGAC T AGC CT GGC CAGCAT GGT GAAAT C C T GT C T C TAC T AT AAAT ACAAAAATAGC CAGAT GT GTT G 

2 9121 GC GCAT GC C T GT AAT C CAAGC TAC TT GGGAGGCT GAGC-CAGGAGAATTGC TT GAAC C T GGGAGGCAGAT ATT GCAGTGAGC C GAGATTGC C 

2 9212 C CATTT CACTCCAGC CT GGACAACAGAGT GAGAC T T CATTT CAAAAAATAATAATAAT AAAAT AAAAT AAC CAGGT GCAGTGGCTCATGTC 

2 9303 T GT AAT C C T AGCAC TTC GGGAGGC CAAGGCAGGCACATCAGAT GAGGC CAGGAGTC CAAGAC T AGC C T GGC CAACATGGT GAAAC C C CC GT 

2 9394 C T C T ACAAT ACAAAAAATT AGC C GGGTGT GGT GGCACACAC GCCC GT AATC TT AGCT AC TGGGGAGGC T GAGGCAC GAGAATC GCTTGAAC 

2 94 85 C CAGGAGGCAGAGGTTGT AGTGAGC CAAGATTGTGC CACTGTATTC CAGCCTGAGAC CCTGTC TCAAAAAAAAAAAAAGAAACv^AAGAAAT 

2 95 7 6 GGAAGAGT ATTTT AGAT T AAAAGTT ATCATC TGT GGGC^AAAAAAT ACAAT AGACAGGTTAGAAT TCAGAAGAGT GTTT C C TGTTT C T AAA 

2 9667 TT C T GAC T AGCT AGT GC CAGAATGAC CT GT GGAAGAGGATTTT AAAT GAT C GGT GTCAT CT AAC C TGAGT TT TAT T TT AAT ATTTT ATTT A 

29758 TTT ATTT ATT GAGACAGT GT C TT GC T CT GT CAC C CAGGC T GGAGT GT AGT GGCACT AT CAGAGC T CAC T GCAGC C TTCAAC TC C T GGGC T C 

2 9940 AGACAGC^n'CTTGCT GCATTC CCAGGCTGGTCTAGAACAC C TGAGCTCAAGTGATC TTC CCTC CTCAGCC CC CCAAAGTACTGAGATTAT A 
30031 GGCATGAGC CAT C C T GC C T AGC CAAGAC TT GAGTT TT AT T CAAAGC TAC GAAGACTT T GGAGTT CAGC T TTATT AT AGAACAGT CAAGTTT 
3012 2 GC TTTAGTTTGTC T AGATTTT GAT AC CTT C TTTGGAATT T C CATTT GT GGC CAT GTT AATAAGT ATGC T CAAGT GAT AT AT AAAGAT AAAT 

3 0213 TGGCCCATGGAAAAAAGTCAGCCTCCTC CAAATGT ATTAGGGATGATT ATTTAAAAGACATTC CTCAGGGGAC C TTGAGGTAGC CAT GTTT 
30 30 4 TTCCATC^QGCCTGTAAAGAAAGAAGAAACAAAACCTTGTTGCTTACCCGGAGTT 

30 39 5 TTT ATTT ATTTAGAGACAGGGTC T C GCT CTGT CAC C GTGGC T GGAGT GCAGT GGC GAGATC TT GGC C CAC TGCAAC CT C T GCC TAC CAGGT 

30 4 8 6 TT AAGCAATTCTC CT GC C T CAGC C T C TC GAGT AGC T GGAATT ACAGGT GTC CAC CAC CATGC C CAGC T AATTTTT GT ATTTTT AGT AGAGG 

3 0 577 CGGGGTTTTC^TGTGTTGCCCAC^GCTGATCTTCAACTCCTGAGCTCA^ 

3 0 6 68 C CAGAAGTC C CTTT CTTTT AATAAAGTTT AAAT AAAGTC C CAAGAAGAAAC T C TTGGCACAAAAGGAT AT AC T GTATTC TT GGAC C CAAC T 

3 07 5 9 TT ATAAGAATCTTC CAGC TT GCAGCACAAAGGCAGC C CAGT C C TCAATGAAAATTT AAAGGGAGC CT GACAGATTT ATGT GAGAGCAAT GT 

3 0 8 5 0 C CATTTAAAC CATTT AAACAACAATATGAAT GTT GT GCAAAGT GT AGC TC C CAT TT CATTGAGAGAAGAGGAAAT AATT AAGAC GGGGCAA 

3 0 9 4 1 AGGAAACACTGAGGAGTTGTTTGTGTCTC^CATC^TGCTTTCAGTTATCTACTGCTA^ 

31032 TTTCAC T GAT GCAGAATTTT GCAAGAGT AAAGAT GCAAGT AAC CAT GT C C C T GGCAT C TTT GGT GGGAAGAGCAC CAGAC TTT AAT GAAGA 

3112 3 GCAC CT GAGAAGAT C CTT GAGGACAATTTT GGC C T ATTCAGAAGAGGACACAGC CAT GCAGATGACTC C TTTTCC CAC C CAGGTACACC GA 

31214 AGCACAT AC C TT GT C TCAT GCAT GAGTTT GGGAT C T GC CAAC T ATT GT GT AT GT AT GT ATGT ACAT AT AT ACACAATTT AT AT AT AAAT AT 

313 05 AT AACATTT GCAAGT ATTT ATTGT C CAAT AT GCAT GT GC T C TCAGCACTC T GAGAGGTTTT AAAAAGAAAT ACAT C CAT CACTGTC C CCAG 

3139 6 TCTCAAT ATGCTAACAGT CT ATTTGGAGTGC TCAGT C TCAAAACAATTAGGAGGCAGTACAAGACAAGT GAT ACAT AAGT GCAAACTGTGT 






31851 TATGCTAGGGTCTTAATTAAGTGA 

3194 2 cCCTCTCGGGCAATGTTGCATCCTCATTCTCAGAATCTITTTCCTCTCCrCTAC^ 

32 03 3 TC CTTTACTTCCATCATTTATATAGAGGT^^ 

3 212 4 AAT GAAAT GACTC T AGT GGT ACAAATTTT AGGAGC C C TGGCAAGC T GGCAGAGGGGAAC GGGGAT AAGACAACATT CT GTGGC T GAGTT AC 

32 215 CTGCCAGGGTCTCTAGATCAAGCCATAGTCTCTCCCTGTTTTTGT 

3 230 6 T AAATGATT ACAAGAGGTT C T AAAAT CT C T GAAGC C C T GGGAAGAT C CAGGAGGCTT C T CAGACATGGAACTCAAGCT GAGGTC CTAAGCT 

323 97 GC TTCCTACTTGGTATAAAAATC C C T GAT ATT CCAGAGT AGAGTTT AGAAC TTTTCAGGTT ACAAAT AAC TGAAACTGGTTCAAAC T AATT 

32488 T AAACAAAAATGTT GGAGAT AGAGAT AGT AT GAGGATTCAGGCAGATTTC T GGATC TCAAGGGC CAACACACATC CAGGT C TCATAAAC TC

325 7 9 CTGGGCTGGCGAAGATGAAAACTACAGAGTCAGGT 

32 670 GCTTCCCCTTGCATTTCAATGTAATCCTAACTCT^ 

3 27 61 GAAC TT C T C T GT AAT CT GAAT AGCAT CTT AT ATGACACAGT GAAAAT GAGGGAATTTCAGGAAGATC C T GAGATGC TT AT GGATCT CAT GT 

3 28 5 2 ACAGCTAAGCTTTCCTGACACACT^ 

32 9 4 3 T AACAT GGAT AT AGT GATTT GGAT GAT AGC CAAAT AAT AT AT AGAAATTAAACATT CAGAGT AGGTT AATTCAT AT GT AAGTTTT CAGAAG 
330 34 GAT C TC C C T AATTTAAAGT GAGGCAT AAT AAT OTT ATTAAAT ATAAT AACATT ATT AAATAT AAT AACAC TAT C TT CT AC TTAC CACAGAA 
33125 T CAGAGAAAGAAT AATT AAC T GTT AT AGGAATTATTTTC CAT ATGC TTTTGTT C CAATTAT ATTCACACATACAT ATAT AGTTTT AC TTT A 

33 216 GAAATCATTTTTT AC GC GT AGTTT AAAAGTT GGT C TT C CAGTC CTT GTTCAGGAGAAATTAC TT ACAGAGGCAAAATT GTT CT GAT GCAAC 
33 307 ATCAT ACAAAGGGCAGT AC TTTT GT C TT C T GTTTT ATTTT GAGAGAAAGGAAAGAAAAGGCAGAAATTT GCC TGAGAGC CATT AAAATAGA 
33398 CATCATGTTATCAGGTATTTTTTCCCC^^ 

TT GGAC T AAT AAAT GTTTT C TTC T GT CT C GTTTT C T GGAAAT ATAGGGCAAAAT CT CAGGTGGAGGGGT AC^GGGAAC TC TTGGGGAGAAA 
AAAAGAAAAGGT CACACAAAGTAGAAGAACAGTGT CATT AAC C CAC T GTC C T CAAAACT AC TTC TCAC T CAAT C T GTC TT CAGAATT GC CA 

33 671 acagttaccaggcatctcctgatctgcggctgacctggctcc^^ 

3 3762 catgtgcctggtgcacgccgctgcgttagtggctgagtatctgagcatgctggag^ 

33 85 3 cAGGrTAG^TCTCTGCAGCTTTTCCOT 

3 394 4 AGATACATTTTTGGTTATCACAACTGGGATGGGTGAGT^^ 

3 4 03 5 AAT AGGAGAGCT C C C CT GACAAAGAATT GT C T GGC C C CAAAT GTC T GT AGTGC T AAGGTTGAAAAATT C CAAGTT CAT ACATT ACATTT GC 

3 4 12 6 TTCTTCT AATTGCTTTC C CATCGT C GTT GGGTTTTTTTT AAATTAC T GTTT ACAAT AAT GGCAC C TAGC CATT ATT AAT AGCAC TTT AGGA 

3 4 217 GACATTT GCAAACAC TTTCACATGCATGGCTTCATTTGAAC CTCC C C GTAAAGC TGT GAGGCAGGTAGGT AGGGAAGGC GGTT ATT ATTC C 

3 4 308 CAC TTC GC GGAT GAGAGAAC T GAGAGAGCAAGTTTT C TAAGGTCAC TT AAAC TC TTTTT CAAAGACTTGT AGTT GACACAGTAT AC T GACA 

3 4 399 TT GT GAAAGTTT GGAAAACATTGGAT AAAT GATTTT C CT C C T GGC C CATT CATTTGATTCCACT C TTCAACTTT AT AGGGGC C CAC TCTC C 

34 490 AATCCAAAAATCAAG^WvGAATCAAATC 

3 4581 CATTTGGCAAAGGGAAATTGTCTGCCAGACCTAAAAG^GGC 

3 4 672 TAATAGTC TTTAGAGAGAGAGAAAAT AAAAAAGCACAAT GTTGGGTAC TTTTTTTGTAAGAGACATAGTTTGT AGAGATGACC^ CCTGA 

3 4 7 63 AAC CAT GAACAAT AT AGC T ACAGT AATAGAGT GTTTTTCAAGC CAGAC TCAC GAAGTCATTT ACAAGGGTTTGT ATTATTC TT GTTT GAAT 

3 4 8 5 4 TT ACAT GGC T GATTTTAT GAAAAGCTTT GTT C TT GTT ATT GTT CTTCAACACAATTTTGTGAT GTTGT AT GAAC CAGAAAGAAAGAACAAT 

3 4 94 5 T CAAAGT AGC TTC C C CAGGC TTAGAGAAT AAGTCAC T GAAAC TAT GC T GGT GCAGC CAAGAGC TT CT GGTTTTC CAGAACACAGCAAAGC T 

35 03 6 GGGTATTGC C TC CTATGAAT AACTC CTC CTTTCTTATGCTC CTCAAGAACAAAAATACT 
3 5 127 TTCTTCTTAATTCAGAAATGTTTGTTATAAAAGCTGATAATTAAATCTCAT 

3 5 218 AAGTGTACAAGAAAGTTATAAAAGTTATTTTAGATGTATTTGCTTCTTTTCTCAAATTT 

3 5 30 9 ATT ATAGT AC TAC T C TTT CAATT C C CAGGAAATT GTAAGGTTT AGCAC TT CAT ATT GTTTCATTT AC TAAATT ATTTT AT ACTT C TTTT AT 

35 4 0 0 TC C TTTTC C CAT GACTAT ATTTT ATTTT AT ATTT AT CAC TTAAA^ T 

3 5 4 91 GTTC TAATTTTAC CT AGAATTCAGTT GATTT GCT AAT AAT GACAT GC CAAAGTGAAT CATT ATT ACACAATCAACAGAAATAI^ 

3 5 5 82 ATT C CGACAT GGGGGCAT ACAGC T C TAT C T GT TCACAT AT ATTT AT C CATT GATTC C TT CTTTT AGAGAATATTT ATT GAATACTT ATT C T 

3 5 673 GTC C TCAT GAAC GTT ACATT C T AAC CAGAGAGAC GT AAT AT AACT AATTATT C CAAT C TTT GTT CAGTTATAATTATCAGAAATACTGTT A 

357 64 T AAAGAGGCACAAGATATTGT GAGCATTT AT GGT C CAGGC C C T GT GGGTCAGGGAAGGGTGAAGAAGGT GAAAAGGAAGGCAGAAGAAAC T 

35 8 5 5 GAAGTGT GAGGGC TT CTT GAT GT AGAGGAGGCAAT GAGTT AGGTGTT GTCAGC T ACAGAAGAGAAGC CAAATT AT ATT AAT GT GT AT GAGT 

3 5 94 6 GAATTC TTAC TT C T C TCAAAT GGGACAT AC CAAAT CAATTT GGAAAT GT AGC T GGCAAGTGAAAGGACAC C CAAC C CAGAC TGACAGAAGG 
GAATTATGGGTGATTTCTTCCCCTAAATCAGGATGCCTTACTTTATA

TGTTGAlTTTG<rAAATGCTCTGTGCTTTAATTTTCAACCTTGTTCTC^ 

T AAAT AAAT ATGT GCAAGGAAAAT AACTTT GAGGT CACT GAATTC CAGGAAAC T GAGAT CAC T GAAATT C TGT GT C C CAGAGT GCAATATT 
T ATTTCACAACT GT AGAT AC GGACACATT C TT AGAT ACT GC T GTT AC TTGT AC C TC C C T GAT C C TGAAGC CAAGAAAGTC T GCAGAATC C T 
TTTC CTC T GACT ACAGC T AAT GAGGC TACAT AGC TT CAAAT C T GTT C C TC T AAT GT GGAAAATT GCAT ACATTC T AAT AGT ATT AGT AT C T 
T GGGTT AT T AAAT GC TTT CAACT GAATTT C TT GGAT C TT C T GTTGT CACAGAAACAT CATAAT ACAT AGGGCAGGTTT GGAAGAAAGAC T G 
(^C?J^AGGCTTTGAGAGC C TC CTC CTATATTC CTAAAAC TACGTTACAGT ATTGCATGTTGAAGAGAT AGGGCTATCT ATGACA^C TATG 
TC C T GAC T GAIT GC T AAGGTT GATTCACATGATC TT GCT AAC CAGGC CAGAAGGCAGACAGC TTTTAGTT CACAAGC CAAC TCT GAT CAGT 
T AGT AGT GGC TGAC T GGAGAACT AT GCTT AAGAATTT C GAGAC TAT GT C CAAGC TC T GGGGAAAAAGT GC TACAGTTGATT AGTT AT GC C T 
GC CATGATT ACAGCAAT AGGAAGGAGTGGCAT GT GT GC CAC C T GTTT GT AAT C C CT AAACT GGGAAGGTTTC C CATTT C TT CT GTTTTT CA 
TATGCATTTCTTC CATAGCT GTGAGC TAGGAAGAAAATGATTCTTGAC CT GTCACATATTCAC TGCCAGGGCCAGT GCTAGGGT GAAGAGG 
CACTCAC C CTCAGGGTCGAGC GGGT ACAAGATCAGT ACTTCCATGGCC CTAAAAGC GAGTACCTCTCTAAATTTTGTCTTGGGTTTCTCAT 
TTGGTTCACC CCAAC CATGGTCTC TGCATGCTCTGC T AGAGGC TCT AAAC GGAATAGTTTATGT AAAGGAAACAAATGCATGGAAAGAAAA 
T GTT CAGGAAGAACAAAAACACACACACAGT AAC T GC T GCAAT GC CAT GAAAAC TT C C T TAAT GAAGACAGC CTC GCTT GC TGTT GT C GT A 
T GTCAT GGC T GTTT ATC T GAGTCAAC TC CAGAGT AGCAACATACTT CAGAAAAACAC CAC T GT AAGT CAGAGGT C CAC T C GGT GAAACAGG 
GAGC CT AGT TAAT GTT AAT T GGGTC T TT GC C TTTT GAAAAC CAGGACAC CAGC C CT AT GTC C C TT AGGGTTGTTT CAC T AAAGT AAC TCAG 
C T GTTGT GACATT GAGGT AAGTGT C C TTT AT ACAAAATC TC C TAAT GGTT AAAAAGAAAAAC GT GAGGTTTGAAGAC CAGTT GC T CAGT GC 
GC C T CTT C T AAAT GAAT GGCAGACAGAT AC T C TC GGGGT AGAATT ACAGAC C T AGTTT AGT CAC GGT C TT GGT AAGGAT C T GCACAC CAGC 
TTCCTCGTTTCCCCATTCGGOTTCCTGTGGTCTC^^ 
T^CTTCCAATCTGCTGGAGC^GTCTGTGGTCTCTGAGGACACCCTGTC^ 

AGTGGCCTGGTAGGCrc^ 

CAGC TGGAC TTGGGGTGC T GGGAACAC C T GGT CTT AATGGC C CAGT CAGC C C CACTT C C CGAGGACAC GT GC CAGGGT GT GC GGGGCAGGG 
GATGGGCCCG<jGGAGGACTTTCATGTATGCAAATT GCAT GAGC TTC 

T CAC TGT AT GCT CATT GGTT GGGCAGCAGTTT CACAGTATT ATTT C T ATTT AAT AGGGGTGGAAC TAAGC CACAGAGAGGT GAAAT GGC C T 
GC C CAGGGTT ACACAAT AAAT GAT GAGGCAT GTTTT CAC T C C C TC GTTTTT C C T CT CAGAGAGAAAAAAATT AGGGAGGAAC CAC T GGGAG 
GAGAGAGGAGGAAT ACACAGACAGT GTC TTCCCTCC T AGC CAC TGT GCAGT C T GAAGGAC CAT CACAGAC CAGGAC CAGC TT ACAGAAAT G 
TGGGCACAGAAAC CACT GAGACT C C T CT GGTT AAC GT AAT C T GGAT C T AAACAC TC C TACT AT AT AT AC T AGAAAAAT AT AT AGAGAGAT G 
AAGT CATT GAGATT CAGC<XIAAAGAGGAAACACT C TT GT C T ATTTT C TTTT C TTTT TTT GAGACAGAGT C TAGC TC TGT C GC C CAGGCT GG 
AAT GCAGT GGTGCAATCAT GGCT CAC TGCAGC CTC T AC CTC CAGGGTT CAAGCAAC T C T C C T GC C TCAGC CTCAT GAGT AGCT GGGATT AC 
AGGTGTGCACCACCACACTCGGCCAATTTTTTGTATTmAG^ 
CTCAAGTCATCTGCCCATCTTGGCCTCCCAAAGT^^ 

CTC CTC C TTT ACATTTT AAGC CAAGAAAGT ATTCAGT AC TTT ACT AT ATTT AGC T GAC C CAATTT TGTTTTCAT C TAT AC TAT AC T CAT C C 
TTATTTTCCAGTTTTTATTO 

C TTTTCAAGATT CAGAAAAT GTC TAAT AT AC T CT CATTTTT C C TCAAACT CAACAAAAT GAATT AGAAT C CT AC T AAC T C TTT GGAGGCAT 
ACATTT AGCATC T GGCT AGAGGAGGAC C T C T GAT GAAATTT AAAT AT AC T AAAACT GC C TTT C T GAATT GCT GTT AGT C C C TGC T AC CAAA 
CTTCTCTCCTGTTTTTTCTTTTCGTTTTGTTT^ 

GAT GCAGC CTTGGC T CACT ACAGC C TTGAC CTCCT GGGC TCAGC C TC C CAC C TCAAC C GCC CAAGTAGC TGAGC^T ACAGGAGCATGCT AC 
CACACCTGGCTGATTTTTTAATTTTTTTGCAGAGATGGGGTCTCCCTATGGTGTCTAGGATGATCTGAACT^ 

CTGCCT CAGC CTCCCAAAGT GCT GGGATT ACAGGCATGAG^ 
m _.._^^r^TnrTTi i r^r-nr/rrr,TrTanrTr.Ar,TTTGMGGCTA^ 
36674 TATTTCA
36765 
36856 

36947 GCCACAAAGGCTTT 
37038 
37129 
37220 

37311 TATGCATTT' 
37402 
37493 
37584 
37675 
37766 
37857 
37948 
38039 
38130 
38221 
38312 
38403 
38494 
38585 
38676 
38767 
38858 
38949 
3 9404 CTTTTCAAl
39495 
39586 
39677 
39768 
39859 
39950 
40041 
40132 
40223 
40314 




TT ATCGC CATACAGGACTAC TTAGC GAG<TTGTCTAGTTCAGTTTGAAGGC T AC CACT GTCC CAAAAGT GCTCAGAT AC C C CTTCTTGCC C T 
GTGAAAT AC T GTGAT ACAACAAT AAATTCAC TCTC CAGGACATTGTTT GGACAATGAC C TC T GGTTGC T C TTCTT AAGTTT CCAGT GGATT 
AAATTCTCTCTGATGCTCTTCTCCTCTTTCCAAGGGAGGCTTATATGAGAC^ 

GC GCATC GAGAATT C C GGAAGCT GACAC T CAC TCACAGCAAGC TGCAGAGAGC C TT C GACAGCAT C GTT AACAAGGTAGC C GGGGAGC C T G 
4 0 314 GCTGGCAGGTCTTGTTACCTOTGGCAGGCGACCCTCTCCTAC^ 

4 0 4 0 5 ATT GCTTTT GCT C T CAC C T GT CAAACAGAAAAGGGC T GAAATT CTT C T AACAGAGGAC CAAAATT C CAT ATGT GAAAACAT ACAGC TTAAA 
TTACTTT ATAAC CAGGAAAT GTGAGAAATTTTTAAGT GTAATT AAAAGAAGTC C CAGAAATCTTT CAT GGGATT C C TTTTGTTGTT A1TTC 
40951 TCATCACCAAAAATAAACAAA^^

410 4 2 GAAAGGT ATT AGGT GAGAT CATGAGGTGAT CT AAT ATTTT GAT AGTTTTT C C C T AAT AC TC T GTGTAT GC TTTTCACAGTTTGGAATTTT A 

41133 TAT GGT GAAT ATTT ATTTT GAAGT C T GT GCAAAATT CAAT CAAGGTCATGT GC TTT C TT AT CAC C CTTTC CAAATATTAGTAGTTTATACT 

4122 4 ACT AGAT AGAGACT ACT ACAGTTTTT CAACAT GAAGTTT AGCATC TT GAC TTT GAAGT AATC T AGC CAAATGAC T GAAT CAC C C C T AGAT A 

4 1315 AT GGTGAGGCCATCCTTTAGGTATCACTC^TGG^ 

4 14 0 6 GC CAGT GAGT GACATTT GGT C CAT AATACAACAGAAAAGT GT AGCAT GTT GTT AGAAGCAAAGC T GAAAGCAT GGAGAAAAAGAGAAACAG 

414 97 CCTAGAGAAAGGTCAGGACAAAAGAAACACAGGT^ATAAGGGCC^ 

415 8 8 GAACAGGAATTT GTT TACAAAGAGC C TGCAATTAC CAAGCAGGACTC^GGGTC CTTTGAACTGCTCACAAT^ CAAAGATTCACTCCTTA 
4167 9 GGCTTTCATTGCTTTGAATTTCTGCCCTCCTATG^ 

4177 0 C CAAAAAAGACAAGAGGGTTTATC CTTAC TTGAC TCAAAGAAC^CATTGCCCA 

418 61 TGAGAAAAAGAGC TAGCAATT GAC C C CAGTC CACTTGGTTC C C TATATTGTTTCCCCTACTTAGAATTAGAGCT 

41952 GT GCAC T GT GATTT GC CACAAT AT GGGAAGGC TGGT GAC TT C CAAGTT C C CAGGGT ACAGAAGGC GAGAAGT AAAGAGT GT GAT AC C CCAG 

4 20 4 3 GAGATAC C T CAGCAAAT AT AATGAT GTT AGC T GAATT AGAGGC CAAGCATACAT CTT AT AGGGAAGCAT ATAC CAGTT GACAGT GC T ATTT 

4 2134 TT ATTTTTGT CTT AGGAAAT GCT GAACTTT GC TCATT AGC T CAGAGGAAC T C T CAAT AATCAGT GAACATCATTC T AC C TTGCAC TT GCTC 

4 22 2 5 CAAACTT ATTTCAC TTC CAAGAAGACAAAGAGTT GCATT AT GTTAAAATAAC C TTT AT AAAC T GTTGGTT CTT C TT AC C T AGGCATTTT AT 

4 2316 GGT CAAT GT TTT GGT GCAGAATT T GT GGAAGT GATT AAAGAC T C CAC T CC T GT GGACAAAAC CAAGTTGGAT C C T AACAAGCT AT ACAAAA 

4 2 4 07 ATTTACAAAAACTAACCATCAAGCTCTAAATCCCTTCGTTCTCTACCCAAGAAT 

4 24 98 TT AAATGCAGTCAAACC TTTT C GT C T AGAGTTCAAC T AC T AATTGGT CAGATC TTAAAGAAAAT AT AGT CAAAGGCAGGAATCAT AATAGG 

4 25 89 aGCTACCACTTATTAAGCACGAACTGTGTACCT^ 

4 2 680 AAGTAGATTTTGTCATTC CT ATTTTAGAGAT GAGAAAAC T GAGACAT GGAACAGTCAAGCAAGTTTCAAGGT CAT GCAAGCAGCAGAGC CA 

4 2771 GTACTCAGACTTGAGGTCTGTTGCTTCTGAACCCCTACTCTTCAGCACTG^ 

4 28 62 CAAACT C GACAACAC TT AGT GGC TT C C T T C TTT AGGC TGCAAGTATT C CTT C CATC CAAGTC C GGGATCACT GT GCTGTT GGGGGAATGGT 

42953 AAAAAC GGC TTGGGTTT GGGTTT C C T CAC TTT CACAAGAGGGT AT GT T C CATTT CAC C C CAAAAT GGGT GTACAGTTC T GATGC T AACAC G

4 3 04 4 TGGAGGTGGTATCAGCTCCCACAGGGTAAAGGCTCAGrcCT 

4 3135 C CAGCCAC GC TTCTGATCAGC CAGC T ACAAATTT GGGAGTTT C TAT AAC C T GTT AGC TT GAAAAT AGGAGAAAACAAAGCAAAT AAT AAT A 

4 322 6 AT AATAAT AATAAT AAT AATTTGGGAGTTT C TAT GAT AC C T GTT AGATTAAAT AATT CACT ACAATGAC T CACAGAATT C CAAAAAGTAC T 

4 3317 C T GGTT AC T ATCAGAGT ATT ATGAT AAAGGGT GAACATC TTTTTGT GC CTTT ATTT AT GGC TT C C TGGC T CAT CAGTGT GTTC T C CAAC CA 

4 3 4 08 GGAAGC T C C GC CAAGTC T CAGTGT C CAGAGATTTT GTTGGGGTTT CATTAT CT AC<XAAATTGAATACAGTC TC CAGC C C C TC TC CT GT C T 

4 3499 CCAGAGCTCATCCAGTTCCTACCCTCTAATCACAGA^ 

4 3 590 AGTCACCTCATTAGCATCACAAAGACACCCATCACTC^GG 

4 3 681 GACATATTC TTT ATT AT AC CACAAGC TGAT GC TGC C CAC T CAGC C CAACT CACATGT C CATGAAT GAGC TTT C T AAGTT AC TGGAAATAGT 

4 3772 GCAAGTGCAAGGTAT CATT AAGGGC C CT GGGACAGAGGAC CAT TCAC CAT C T AGCAAAC CT AT AAAAT GAAAGGT C CAAACTC C CAGTT C C 

4 3 8 63 AT C TTC CAGGAT AT GAAGAAGAT AAT GGAAGAGGGGiAAGAATT GGC CAAATT GAAGAGTTT GTTTTT C T ACATTTTTCAGAAT GC TTTC TC 

4 395 4 AC TT AAGACACATT C C C T AGC CT C GGCTT GAAAGCAGTGGC T GTGGT AAGAGTTTAAC T AAT C TT CAGACACACAT GT C T GGGAGAT GGAG 

4 4 045 JTGGCCCTCTC^CCACACTGATCTCTACATAGCACA 

4 4 13 6 ATT C CCAGGGT AGGTTGT GAGAGTTT GGAAT GAGAGT CT GAAC C CAGAGTT ACAAC CAGATTT CATT AT ACT AAGT CGT GATTT ACAGT C T 
4 4 227 GAGGTCAGT GAC C C CAC T CAT C C C TTTCAGT GGGT GAGT GT C C CAGCATC T GAACT CAT GGT CAC TT'IT'TTT C C T AAGAGATT GT C GTC TT 
4 4 318 T AAT GAGT AC TT ATT C GC T C T GT GT T T GAGGGTGGAGGT GAT GT GAGT GT GT GTTT AAATGAGGC CT AGGCAGT AAAATT CAGTTTT GGT G 
4 4 4 09 TTT AGTT C T ATGAGTTTT GACAAACACAT GT GAC T AC CTTCAT AAC CAAGATAT AGAACAGATC CAC CAC TC CAAAAT AC TTC C C GTGC C C 
4 4 5 00 ATTGGTAGTC CAT GTTGCAAT AGTTC CTTC C TTTC C GTT GCTGAGTAGTATTC T GTT GT GTGAC T AC C TCAC CATTTC^TTGTT AATTCCC 
4 4 5 91 CAC T GGAGT C^CATTT GGGTT GTTT C T AGT C TTT GT TAT GAAT AAAGTTGC T GGAGACATTT GT CT ACAGGTTTTT GT AT GGACAT AAGC T 
4 4 682 TTCATTTC T C TC C T GT ACACACAT AGGACT GGGGTT GCT GGGT C CAAT GGT AGT GCAGTTT AAC T GCAT AAGAGAC CGC CAGC TT CTTC TG 
4 4773 CAAAGT AGC T GT GCATTTT GCATT C C CAC C C T CT GT GTAT GACAGC T CTAGC T GAT C TC CATC CTTGC CAGCACTT GAT ATTGTT Au' ri u lT 
4 4 8 64 CTTTAGTTCGGCCATTCATGCTTCCTCATAACTGTC^ 
4 4 955 CAAACTCTCATACCAAGGCTTCCACTCTGC^TCACT 
4 5 0 4 6 CCCTCTCCCTCCTTGCAGCCCCTTCTGTCCTTGGGG<^^ 

45137
4 5 501 GGATTGGCCACCACTGGATGAGT

4 5 592 GATGCAC TT GAAAT ATGATT ATAT GGTT GT C C TTT CACT C T CAT AGT GCCAGAAAAT C C CATTT AGGCAGCT ACAT ATTTT AT AGCACT AT 

4 5 68 3 T CT ATATT AAAT ATTT AAT ATTIT AAAGTT AT AT C TTTAGT AT AAGT GTGTTT ATGT ATTT AATT AT AAT ATTT AATAT AT ATTT CAATT A 

4 5774 T AAACTT GT ACAACAGAT AT ATTT AC CTTTT AAAT ATTTT AT ATAAAATTTTT C TAT ATTTCAAAGC TT AGAGGT GATT CAAGCAT AGT C G 

4 58 65 T GC T GTT AATT ATT GGAGAC GGGAC C TGCAC GTGGGCAGC C C CAGTGAGGC GGT GGT GT GC T GT GGGAAAGGGC C GTGT AAGGT AGACAAG 

45 95 6 TT CACT GAC T AGC T T CT AGT C CT AGC TT C C T C CT GT GATTT T AAACAAGC T AC GTAC C TTCAGTTTC TTCAT C TAT GCATT AGCAGGAAAG 

4 6047 ACCTCTAAGTACAGTACAAGATTATACTCAT^ 

4 6138 ACACACCATAAACAACTTATTATTCTTTA^ 

4 6229 CT GT GGGTTT GC T GATCACAATT CAGCATTTT CTTTT AT AAAC CACAC C C GT AGTGC TTTGT C CATGATTTT CAGTTTT GC TTT GT GTAAG 

4 6320 CAGAGT GAGAGC TT AAAGAT C CTT GTTAAACAATT T GAGAGCAGAAGC CTT C T GGAT GTTT AT GATGTTTTT C T C C CC GAGAC TTTGACAG 

4 6411 CAGT CTTGT GCACAC CT AAT ATGACAC^iAATTTTT AT AGCAAC TCAC TTT CAT AAT AT C TT GT C CAAC CATTT GGC TT GGTTTTCAT AGAA 

4 65 02 AGAAATC TTTTT C T TTC CACAC C CAT GGTTCATCAGTTT C T C CATT AT CT AATT AGATT GGGT CATT AAAAT AACAAGT AT AACAGGCAT A 

4 6593 AT CAAGTT GGTGAACAAACACAGAT GAAT T GT GGT GAAT AT AT AC C T CAT CAGGCAGAAGCAGAAAT AGC TGAGC T AAC T GGAGAGT CAT C 

4 668 4 CAGCAGGT AGTGTT CAGC T GGGT AAC CACAT CAGGGT GT GGC TTT CACAGAAGC TGAAGAAAGC C TGAAGCAGT GAC T CAGTC GGGAGGAG 



4 6775 GTGGGTTT AGAAGC CAT C T GC TGT AC TTC C T ATAC TT CT T GGTTT GGACTT TT GAACAC T GAGACATT C T AGCXLAT AACACAGAT AT AAAC 
4 6866 
46957 



4 68 66 TCATC^GTGCCCAAAAGCATTTCAAAACCTTTTCTTTTCTTCATTCCTGCCTTCCTTTCCTTTC 



TCTGTGTCTCTGTCTCTCTTTCTCTCTTTTTCMTC^ 

47048

4 7139 ATT AAAAAT ACAGAGAGAT AT AT ATTTT GT AAGAGACAGGGGT C TT AC TTT GTT GC C CAGT C T GGTC T CAAAC T GTTGGC C TCAAAC GAT C 

4 7230 CTCCTACCTCAGCCTCCCAAOTOTGGGATTAC^^ 

4 7 321 A cctcaacaacagttctcac^tgtttc<:tagggctgtac^ 

4 7 4 12 CAATTCTAGAC^TAGAAGTTTGAAATCAAGGTG^ 

4 7 503 TCCCTGACCTCTC^CAGCCCTCTTCTCCCTGTGTCTOT 

4 7 594 CAC CAGT CAT ATTGGACTAGGC^TTACCCTGCTGGCCTCATTTTAACTTGAT^ 

4 7 685 C T GAGGT AC T GAAGGTT AGGACTT C GACAT AT ACATTTT GGGGGAACACAATT CAAC C T AT AAAATT CAGAAAAGACT C T AC C C CAAAC CA 

4 7 77 6 GCAGAAC TTAGCAAAT AGATT GAT T GAC C C TT AAAAGAAT T C CAT TT ACT GGAAATT GAC C C T CAGTT GGAGAAGGCACAGGT GAT ATCAA 

4 7 8 67 AAGC CT GT GTT AT GATGGGGGAGAAAAT C TT GAGT GC TGT GC TTC T AC TACAGC TTT C T GCAT T GTAAGTTGAGT AACAT GAGGC T GTGT G 

47 958 C GGT GGC T C TTGC C T GT AAT C C CAGCAC T TT GGGAGGC C GAGGCAGGGGGATT GC C T AAGGTT GGGAGTTTGAGAC CAGC C TGGC CAACAT 

4 8 0 49 AGAGAAAC C GCATC T CT AC T AAAAAT ACAAAATT AT C TGGGT GTGGT GGT GCAT GC C T GTAAT C C CAGC T AC TT GGGAGGC TGAGGCAGGA 

4 814 0 GAATTGC TT GAAC C C GGGAGGC GGAGATT GCAGT GAGC CAAGATCAT GC CATT GCAC T C CAGC C T AGACAACAAGAGCAAAAC T C CATC TC 

4 8 231 AAAT AAAT AAAT AAAT AAAT AAGT AAGTTGAGT AAC TTGC TCACT AAT AGGAAAAC^CACC TGACAGGCT TTTGAA 

4 8322 GATAGACAT ACTT GT GAT C GAATT AT GAC C T C CT C C C TT AC T AC C T GT GGCAT T CT GT GGAAC TT AAC C TT ACAAATC C C C CTTTGACT GT 

4 8 413 AAAATGGAGAGT AT AAAGAGGTT GTT GAGAGGAAT AAAT GC T AAAT GT AT GT AAAGTT C CT GGAACAT AAGGAAT CAAGAAAT GTT AGGC C 

4 8 5 0 4 CATCTTTCTTTTTAACCTGTTAAGAGTATTT^^ 

4 8 5 95 TAGC TC CAGGCTT GT C CAT AGCTT AT GAGAC CAGACAGT GAC TTC C C TAT GTT T AC GT C TCAT GTTCAGTTT GTTTTGACAC C GATT GAAG 

4 8 68 6 TT GC CATT GAAGACATGAAGAAGAAGAC C C T GCAGTT AGCAGT TGC CATT AAC CAGGAGC C GC C T GAT GCAAAGAT GC TT CAGAT GGTGC T 

4 8777 GCAAGGC T C T GT GGGAGC T AC TGT AAAT CAGGT AAGCAAAAC CAGAGGTGGCAGC T C C T CT GGTT C TT ATTATT T AGGTT GTCATT AT AC G 

4 8 8 68 TCTGCAC C C TTC TTC CTTGGGGTT GATGAGGACTTT GATC CAT AGACAAACACAGAAAT GTT C C T ACAC TTAAC CT GAACAC C TGT AAGGT 

48959 TT AGAAGAC TTTT AGGAAAC CTCTTCCCTTT CAT GT AAC AAC C C CAGCT AAAAAAGAAATC C T AGAGAT GAGT GGAC CAGGCTTT AAGGAG

4 905 0 T AC CAC TTT C TCAGAGGAGT C TC CAC TT C GGGGC CAGAC C T GACAAT ATGAT GCAAAT C TGGAGT CAT GTTGAAGAAT GC TTAGT CATGAC 
4 914 1 CAATTCAC GCAGAGT AATT GCAGGGC TT GAGACT CAC C T ACAAAT GC C T AT AGCAGAGAAGGAAAAGGAT CT AGAACAT C CAAC T C TTGGC 
4 9232 TCAGTCAGCAGAT GAAC C CAGCAT GC CAAGGAC CTT GACAAC CAGGAATGAC C T GGGAC CT GAC TTC TT AGGC T AC TT CAGACAAGACT AG 

4 9323 ATCTTCCTATCAGACTTCTTAGAACOTGACTCCCAAGTTCAGCATGGT^ 
4 9414 T CTGTCCAGTGGC<:CTCCTTAATTCAAATCTAGATCTGTTCTTCTCCACACTT 

4 95 05 CATTCAGCGTAATTGGATTTC CC CACAAAAGTTCT GAGTGT AC TTGACAT CAAGGGAGCAGAAACAGAGAAGAGAAAT GC C T ATT ACATT C 
4 95 9 6 C CAAGAT CAGGAAAAAAAAAT GAGGAAAC GTTTGC C TTT GT AAGT GC CAAT C C TTT GAT AAAAT GGAACACTTT C CAAGC C CACAAC CATG 
50 051 GTGTGGTGGCAGGCGCTTGTAGTCCCAGTACTTGGGAGGCTGAG^

5 014 2 GArTATGCCACTGCGCTCCAGCCTGAGTGACAC^GTAAGACTCTGTCTGWW^ 

5 0 2 33 CTGCGCACGCCTGTAATCCCAGCAATTTGGGAGGCTGJ^^ 

50 32 4 AAAATC C CAT CAC T ACAAAAAAAT ACAAAAAT AGC CAGGT AT AGT GGCACACAC CT GT AGTCT CAGCT AC TT GGGAGGC T GACAT GGGAGG 

50 415 ATC ^CTTGAGCCTG«^GTGGAGGCTGCAGTGAGCAGAGATCA^ 

5 0 5 0 6 AACAAAAACAAAACACACACACAC^^ 

5 05 97 TTC C TGAGAT GT C CAAGAC C TTC C TT ATTTTTTTTTGTTT ATTTCAAAC CAAC T ACAAAATT AAATGTTCAAAC T AAGAAGTCACT C CAAT 

5 0 68 8 ATTTTCTCCCAAGCAATTTAATTTCCTT 

5 077 9 ACAT ATACAT AT GCATAT ACATAT AT AT GT AT GT GT ATAT AT ATAT GT GTGT GTGT GT ATAT AT AT AT AT ATAT AT AT ATT CTTTTTTTTT 

5 087 0 TTTTTTrTTTTTTTCCCAC CAGGGAC CACTGGAAGTAGC CCAAGTGTTTTTGGCTGAAATTCCTGCTGATCCAAAACTCTATC GACATCAC 

5 09 61 AACAAGTT GAGGTT ATGC TTT AAGGAATT CAT CAT GAGGT AAGAAGGAAAAT GGCT GGGAATTT CAGT AGAGCAGT GGTT C TCAAAGTGCA 

510 52 ATCTTAGACCAGCAGCTTCA^ 

5114 3 GAGGTGTGGCTCAGCAATGTGCAGTTTAACCCGCCCTCCAGGTCA.TCCTGATGC^ 

512 34 T AC TTGAAC TAG T AATGAGT AAC T AACAC GT CAAC T ATGAAAC GC TTTTGT GGC T AGCATC C C GT GT GC C TCACAATCAC TTGTTGT AAAA 

513 2 5 CAAGTAT CAT CAT C TTC C T CATTT T ACAAAAGAGGAATCAGAGGTT CAGAGAGAGGAGAT AATTTTAC TT AAGGTCACACAGACAGTTGGC 

514 16 AGCAGAGAT GAGC T CAAAC CAGGT C TTC T GAATC CAAAT AGT C CACAT GT C CAT CAAT GTGC T GAT ATT C CAC T GATGCAC T AGAGT C C CA 

515 07 GAGGTT C TTT GCATT GGCAGT CATT GT AAT GTT ACAAAT C TT AT AAT ATC TT ATTTTT AGAAAC TTAAAAACAT ACAC T GC CT CAAATT GA 
51598

51689 GCAGTGGAGC TGGC TTCAAGAGCAGT CAGTT AGGAAC TGGGT C TTT GGTCAAAGC C TT GTAATTT ATT GCAAT ACAGACACAT AT ATTGGG 

5178 0 TCTCCTATGGTCCTAAAAATATGAGTATGAAGGATGTGATGGCATTTCACCTATAAGAGAAAAG 

51871 T C TTTC T ATT CAAAAC C TT AGAGAAGAAT ATT AAGT ATAAGAAT AT C TTCAC TTGGCAGAT GGGGAT AAAGGAGAC TAAAAGTTT GC TAAA 

51962 TT CAGT GAAAAC T CATCAAAATGC T AGCATTT GT GAGGTT AGAGATATCT C CT C CTTCACAGGC TTTT GC GAGAGGTC C GCAATTTT GC T G 
52 05 3 CATATACAGGAGACACACTGTGCCCTGCAGTGAATCACAGCCG^ 

5214 4 GT GT GT GAACAT AC T CACAGATT AT C TT GATT AAT GTTTT GAT AT TT GAAT T GTTT C C T GT ATTT GACAACAATT CTGAGAGAAAAACT AG 
5 22 35 TTGTTGTTGTTGTTTTTTCAGTACATAATACGTATCTTTTAAAAGGCTAGAAA 
52 32 6 GAGACAAGGTCTCACTCTGTTG^CCAGGCTGGA 

5 2 417 CCTCCCTCAC<:CTCCCAAGTAGCT&3GACTACAGACGTGTGCCACCACATCTGOT 
52 508 ATGTTGGCCAGGCTGATCTTGAACTCCTGACCTCAAATGATCCACCTGCCTTC<5 
5 2 5 99 T GCCCGGCCCTATTTTTTATTTTTTGTGGAGACAGGGTCTCACTATGCCGTC 

52 690 TT AAAAAAAT TTTTTTT CAT AT ATTTTT ATTT AAAATTCATT ATGGTT GAAT GC TT C CAAAGTT GAC TAT GC C C CAGGAC CTC T AAAAGGA 
5 2781 C C T ATGAAAT GTTT GAAGAGCAC TCACT AT ATTC CAGGCAT GAT ATT AGTT AT GGGAC T ACAGAGAT C T GGC TT GTTC C TCTGTTC TTAAT 
5 2 872 GACAGGACAT AGC C C TGCAT GGC T GAAT AAT AGAAT GCAGCAT AT GCACAGAAC TTT AAGAAT GACT AAC TACAGGACT AT AAAGAT CAGA 
52 963 AAAGAGAAAT CAGT TTCAGGC TGGAGCATTT GTGGGAGGC GTT AT GGAAAAACAC CAC TT AAAT C TT GT C TT GAC T ATAT ACGT GGAAGAA 
5 305 4 AGGC GC T AGT GGGGAAGAGGGAGC CAGGACAGTGGC C CT GC C T GATC CAAGGT ATCAC TTT CAGC TGC TTTAGTT ACC CAT GGTTTGAAAG 
5 314 5 T ATT AC TT ACAAT AT GAT ATT CAGAGAGAGAGAGAGAGAC CACATT CACAT AAC TTTT ATTT CAGTAT ATTGC T GT AATT GTT GTT AAT C T 
5 32 3 6 CTT ACTC TAT CT AATTT GT AAATT AAGC TTT ATCAT AGGT AT GT AT GT AT AGGAAAAAAACAT AATAT AT GT AGGGTT C GGTAC CAT C C GC 
5 3 327 AGTTTCAGGCATC C C CT GGGGGTC TT GGAAC C TAT C C C C GAGAATGAGGAGGAC TGAGT AC CAC TGAGT GGAC CAGCAGGACCAC GAGAC T 
5 3 4 18 CTGTCAGGAATGC C T GT GATGCAC C GGGTTC CTCTAAGT C CAGCT GGAGCAGC GTGTTCTCATGAGCAT GCT AGGACAAGGAGGAAAGATT 
5 3 5 0 9 GGAAATATGATAATATCAGTAAATGAGGCTCTGAATACAATTTAAATGTTTAAGAAA 

5 3 6 0 0 TTGC CT CAATTTTTT CT AC TAGTTATAAATCTTGC AATC CAGT ATC T CAC GGGTATGATCC TTTC CATGACAC CAC CTAGGTAAAAT GTAC 
5 3 6 91 TT ACAAAAT C CCT AC T AAACAT GC GTTT C T CATT AGC TCC CAGCC C CAGGAAAAGTT AC TC CAAAAAGAAC CAGT GC C CAGTAAAAC TAGT 
5 37 82 C C CAGGCAAGAGAT ACAAGAGGAAAT GGAC TTTC C T AGGGGT C GC TCC TTT AACAAAGT CATT AGAGTTT CAAT AATCAGGCTATCATT C T 
5 38 73 C C C CAGAT GGC C T ACAGAAAT GTT GGATTTT C AAAAT CAAATT CAC CATTTT AAAGTT GTT C T C C TAC C C TTTT AATGGT GATT C CACAT G 
5 39 64 GGAACCAGTCCCACCTTTCTCCCCACTGGCTCOSCTGCTG^ 
5 4 0 5 5 CTTACACGTCACCTGAGGATTTGTCAAAATGCAGATTTTGGCCTGAT 

5 414 6 GGAAGGAC T GCTT GAGC C CAGGAGTT CAGGAC CAGC" C TGGGCAACAT GGC GAAACT C T GTC T C TTTAAAAAAAT ACAAAAAAAT ACAAAAA 
5 4 601 GC CATC CTTATCATCATTTACCACTTT^

54 692 AAAAAAGAAGTT GT AAAT C T CAAT AAAATT AT GT CAGAACAAAAAAAAAAGCAAACAT GCT CAAACTGC T GCAC C TTGAAC CAAC T ATAGA 
5 4 78 3 CGTCAGTTCATCCATTCATTCAGCAGTTATTGAGCTCCTA 

5 4 874 AGT AGGACAAGGC T ACAGACAAAACAGT AC T GCT GC T CATT ATTCAT C TC TAT GTGCAAACACAACAGAT GGC C T GC C C T C CAC TTCATTC 

5 4 965 T ACAGAGAT CAGAAGTC GT GGGT GAGAGT GT GCT GATGGT GCACAAC T GT GT AAATT AC TAAAAC TCAT CAAAC T C TAAACAT AAAATGGG 

55 05 6 TGAATGTTACGCT'ATCTAAATCATACCCTGAATACAGATAGTAGATGA 

5 514 7 TT ACAGC TTTTAAT GAATAGAGT AAC TATTTTTT CAC GTT GTTTGT GC TGT GT ATTTT C TC C TC CAC TC T GCT AACAGACAGCAGAGGT GA 

552 38 C GCAGAAGT GGAAAATAT TT GGGGC TTGAACAGC T C T CAGGTTTC C C C TTT AGT AAGC T C CAGC TTCT CAGCAGAC TT GGGCT GT AGATC G 

5 5329 AT AAGC C CAGATTT C CAAGGT GAC CATAAAT GAGT C T CTGTT GCT GAGAGC T AATAGC GGAAACATTT GC TT GC CAGGT C CAC T GC GTC TA 

55 4 20 GT GC CAC CAC GGGGCAGC GAC CTTT C T GAC T CAGC T GT AGT GGAAGCAGAAACAGC CAT AAAGAATC C T GGCAGC C TGATT TGC T GCAGC C 

5 5511 AGTACTCATCCAGCCAGTCCTGCAACTCTTCAAACTGT^ 

5 5 602 AT C T CGGGTTCAGAAGC CAAGACAAGAAGAT AGAGAGAT ACAATC TT C TAT C TT C C T C TTC T GC TCC C C CAAACAC C C C GAC CATTT AGAT 

5 5 693 TTTTAAATTTTCTTCTTTATACTTATCTGGACTTTCTCATTCTTATAAAGACTAAGTGATGTTAAAT 

5 57 8 4 AGT AGT AC T CAC T GGCAC CAGTT ACAGC TT GC CTT T AAGAGAAGT AGTTT CAGATACAC C C T GAAAGGGTTC TGCAGCAT ATAT GT GGT CA 

5 5 875 T AGGCT CAGAAAACAGGC T GTTGAGT GGTGGC GTTTTTAAAC T GGAGTTGGGGT CT AT CAACAGGAAGGAGAAGGAATTT GTC C GAGTAGC 

55966 CAAAGGACAAC CAAATGT GT AGAGT GT AGGT GGAAAAAGAAGCAGT AGTTTT AAC TT GAGAC CAAGGC CAT ATGC C TGGC TTAT AGC TGGA

5 60 57 AAT GGGGAAATGGC T TT C C T AGGCAGT AT AT GTGGC GTT GGGGTT GGGAAT AT GGGCAC TCAAGC CAGATTGC C T GAGTT CAGATC C CATT 

56148 C TGC CT CAAC TAGAT GT

5 6239 AGTT AGTGACAGAAT AAAAT GAGT T AAT ACATGGAAC TT AGAATAAGACTT CAT ACAT ACT AAGGGAT CAGT AAGT GT AACATT GTT CAGT 

5 6330 GGGGCAAAT AGGGGACT GAT GGATTT GAGT GGGAAAT AGAGAATT AAT CT GAC TTAAAT AC GGAGATT GT CT AT C CAT GATTT GTC TGTC T 

5 6421 C T AT AAAGT TTGAAT CAT AAGACACAGT GAT GCT GAT GAGACATT GGC CT GGGAGCAGCAGGATT CT GGGTTT AT ATC CAGCT GT GC TGTC 

5 6512 C CACAGGT AT GTGAC TGGACAGGGCACTT CAC CT C TTTGCAT GTT AGTTT CAT CAAC T ATGAAAT AAAGAGAC T AGAAT ACAGCATC TC T A 

5 6 603 AT AGTTT ATCAT T C T CAT ATT GT ACAAAT AGTTCATTTAC TT AGC C T GGGT C T GTCAGGCAT AAT AAC GC TAC CAT GT GC T CT GGC TTCAG 

5 6694 CT GT GT GCAGGGAC T CTT C T GAACATTT GAT ATGTTT CAAC T AATTT AAT C TTT ACATT AATTT ATGAGGT AGGC T C TTAT CAC C CACACA 

5 6785 T CACAGAT GAAGAAACT AT GACAT GAAGAGGTTAAGT AGC TT GTTT AAGGTT GCAAAGC CAGT AAGCAGCAAAGC GGGATT CAGAGTTGAG 

5 68 7 6 CAC T CT GGC T C CAGAGT C CAT C C T C TTAATT GCCAT GCT GAGC TGTTCCCTC TACT GAC TAT ATTCAGTT GC T AGT AACAGAAGGAAGAGT 

5 6 967 AGC TTAAAT AAGAAATTT ATT TT T C T CT CACATT AAATAAGAT TGGAGGT AGT C GAT GT AGAGC T GT GT AGT GGC C TCAT AAAGT CATCAG 

5 7 05 8 AGAC CCT GGTTC TTTTC CAAT C C TTT GC CAT GC CAT C CT GGTT CT AGT GT AC C CATT C T C GT GGTCAT GATAT GGTTGC T AGGGC T C CAGC 

5 714 9 CAT CATGAC CACAT C TAGGCAAGT CAGGAGT AGAAAT GAGGAAACAGCAAAAAGAT GT GC C CATT TCC CAGT GC C T TCAC C TAT ATT AT CA 

5 72 4 0 GC GATC C C TAC C T GCAT GGGAGGC T AGGAAGT GT AAGTTTT CAGGT GGTCACAC TGC C T GGAGT T CT GC (IAGT AGGGAAGAAAGAAT GGAT 

5 7331 ATTGAGAAAACAACTAAC GAATGTTTGTCTGC CACACTGAGGAAC CCATGTT ATGGGCT GTGC TGAAAAAGGGGGGC CAAGGCTGGGT ACAG 

5 74 2 2 T GGC TAC GC C TGT AATC C CAGTAC TTTGGGAGGC T GAGGT GGGC GGAT CAC TT GAGC T CACAAGTTC GAGAC CAGC CT GGGCAACAT GGCA 

5 7513 AAAC CT C GT C TTT ACAAAAAAT ACAAAAAAAATT AAC C GGGT GTAGT GGC GT GC CT GT AGTT C CAAC T GC TC GGGAAGC T GAGGTGGGAGG 

5 7 604 ATCACITGAGCCCAC^GC^GAOT 

57 695 AAAT AGAGGGGGT AC CAAGAGAT GCAGGGGGGGTGAGGGCAGCAT GAC TAC T C T CT C T GT AGGAGAC C TT AAC T C T AT AAATGGAGGCC C C 

577 8 6 AAAATGTTAC TGC CATCAAAAGC CAG^SAATC CTTTTCTGGAGGCGTAACTTC C TGC C C TTT C TAATC C CT ATCAATCTGGTTTCT GT AGAA 

57 8 77 CT GT GACT GCTAGAAAAC C C CAGGCATATTT GTT CT AAGAAAAT AC TTGTGTTC GGTGAATTT AC CAACAAAGGGAGCATCAGAGGATGTG 

57 9 68 AGGGAAGT C T GGAATGGTT GT ATCAC TAAGTGAGAGCAGCACAGATGTTT GT GGAC C T ATT GAGAAT GT T ACAGAT AAGAC CATTTTTGAA 

5 8 05 9 AAGTTGTTT GCAGTGTCATTTTATGATC TTGTGTACATTTT C CAAGC GAT GT GGCT ATTCT C T AGGAGGCAT AGT AGAAATTATTTCAATT 
5 815 0 TT AATCAAAT AAC C T AGAGAAT AT AAC C CAAATGAC T CAAAGGAAGAAAT GT ACAAAAAGT AT AT AAAAT AATTTTTT GCATT AT AAAAGT 
5 8 2 4 1 TTAAAGACAT AAAGT AAT ATT AC TACATAAAATCTAAGTTTTTTAC T C CAGCTATT AAT AT GTTTTT CTTTAT AAAACAT CACATTTATT A 
5 8 33 2 ATT GCTGT GT AACAAAC TAC C TCACAATTT AGTGGC TT AAAAGAAAATTT AATT ATT AT GCAT GT GGT ACAT AAT AATTTTTGC TTT CC T C 
5 8 4 2 3 ATTTCTACT C CTGATAC TT GC CTATGATGTOTTCATGATGGC T GGGGC C CTAGC GAGGT GT ATT GTGGC CAT GAGAAT GGTTTT GCT GCAA 
5 8 514 CTTGGCCTTGGCTGGGCTCAGCTAAGCAGTTTTTGCCTX^CT 
5 8 60 5 TTCCTCACTCATGTCTGGTGTCTG^OTCACAATACTCAAAC^ 

5 8 6 9 6 CCTGCATGACAGCTTTAC^GGTAACTAGACTT CrK^CAGGATGGTTCAGAAC T C TTAT AGTACAGAAC GACACACAGACAGAGGCCAT ATT A 
59151 GAAGCTGTTAC^TACXITGCTGCT^

5 92 42 AATTAGCTGTTAGCTTTGGCAGGACTCATCCCACCTCCTCT 

5 9333 TTATmACACATGGTGCTA 

5 94 2 4 CAT ATGC C TT CC C TT CT ATTTCTT C C CT AAAACC GTT AATT CATAAC C TT AGCT ATC CTAT AATTTGC TTCAATT AAT GACAAACAT AAAA 

5 9515 GATAG<7TCAAAATCCAGTTAAG^ 

5 9 60 6 GC T GTGGT GC T AAGTTT ATTTTT GC C TAGGCAGGT AGCT C TTC CTT GC CACATGCT AT C CT CTT C CT C TTTGAC T GTCAT C CTGAAAATGT 

5 9697 GTTCAGTGTTAAGAC^GATTTGATAA 

5 9788 ATATTGATCACC CAC GTCTTATTCATTTGAAGGATAATATAGTTAT^^ 

5 987 9 C TTT AAGGTT TATT GATTT GT AAGTT C C C TT GGGT AGAAGAGT ACAAAACAAAAGGCAAAAC TT CATCAT GTTTAAAT AAC GT GATTTTCT 

5 9970 T C TTTGGT GT GT AC C GT AC C GTGCAGGAGACAGGGAAAACAC CAGGTTTAAC GGTC CACAAGC CAGAT AAC CAT GATGAGTTC T AAAGATT 
600 61 TT C T CT GT C TTT AAT GTT C TTCAGTTTT GT CATGAGAAGAC T GGAT AT AGC T C T GTT GTTGTT CATT AGAGGATTTAT ATC TCTCTTCAAC 
60152 TC TAGAAAATTACCAGCTGTT^ 

6 02 4 3 TC C TTAT AT C TC TAT GTTT C TTAAC CAATTTTTT CAT ATTTTTTT ATTTTTT AAAT C T CATT GGT GATTT CC TT GGATC TGTCAT ACATTT 
6 03 34 AC TT CTC CAATT AT GTC T AGT CT ATT ATTT AACC TTT AAC C TTTC CATTGAGTTTTTTGTC TTTT AGCAACT GCATTTITTTTTTTTTTTT 
60 4 25 TTTTTTTTTTTTTC4GTGAGACAAGGTCT 

60516 AGGC TCAAGC CAT C TTC C CAC CT CAGCTT C C CAAAT AGC T GGGAC T GCAGT CAC CT GC CAC CAT ACC CAGCTAATTTTT AT ATTTTTTGT A

60 607 GAGATGGGGTTTT GC CAT GTT GC C CAGT C T AGTC TTT AAC TTGTGAGC TCAGGCAATC CAC C CAC CTT AGCTTC C CAAAGT GC T GGGATT A 

. _ _ _™ m ^ r-rr-^r (^*£AC C CAGC C T AGT GAC TGCATTTTT C TTTTTT CT AGAAGTT C T GTTTGGTC C TTTTC TGAAC TTTTC C TGGTC TC 

60 78 9 TT TTTT CATT TT C C C TGTTT ATTT GTTTTT AT CAT GGATTTT ATAC C TTTT ATTTATTTTC TTTTTTTTTTTCT AGTT 

608 80 GGATTTT AT ACATTTT ATT AC CT C T C TAT ATT AAC TTTGGTTTTTT ACATT GTTTT ATT AT C TC T AATT C TT ATT GGACT AATT C TTTT GT 

60 971 TTGATGCATCTGTTAGTTTTCCCTCGTGGTGGTTGGTTTCTTCATATGGTT^ 

61062 T CAT AT GC C C TGATT GAGAAT GTGTTCCTC CAGAACAAC TTT ATGTT GGTTT GGCT GAATC C T AGCAATTTCAGT AAT C TTGGAC TGGC TT 

61153 TT AAGTT ATTTT C T CAC C TT GAAGCACAT ACAGT CAAGGAAT GTACATTT GT AACTT AT AC TAT GTGT GGTGCAAGCC T AGAATTTC CATT 

612 4 4 T C T CAAT AT GAC TTT CTTT T C CAT AAAT GGC C CT AAGCT GAT AGCAAGTTTTCATTC T GCC T C T CTGGACAT C TTGCAGCATTTTT C TAAA 

6133 5 CCCTCTTTCATACATOXATAGCTTTTCAAGC^TCT 

614 2 6 T CAC T AAT C TTGT GT GAGC TTTGAAGC CCCTTCCCCT CAGC C CAT AGACCT AT ACACAGTC C T AAAAT C TTAATGGGCAGTCC T AC T GACA 

61517 GCTGCCTTGTCACCAGCTCCTGTGATCATTCTAGCTTTGATTTTTCTCT^ 

61608 TTTT GC T GT GCTTTC CT AGCACTT C CAT AT GT AC GT AGCAGGAGGAGGCT GAAT GC CAT CTGC TC TGTCTGC CAT GTT GC T GT AAATCACA 

61699 GT GAGTTTTTTGT AAGT GT AACAGC TTC CATT CT GCAGT GT GTTTT GAGT C T GACT C TT AGATC CATCAC TT C C T CACAGT GTT ATC TTGG 

617 90 GCAAGCATC C TAACATTT C T GAGTTT CAGAAACAAAATAGAGATAAAT GC T GAC TTTTT AGGGTT GTT GGGAAAATTAAACAGAT AATGCA 

61881 T GT AAAAC TT CTT GAAAC TT C TT C T GGCACAT AGCAAGT CAGAGGTTC TCAATTCT GGC TAGGTT GGC TTCAGT AT C C C CTG<iGCAACTTT 
61972 TT AACATT AGACATTTC T GGGC CACAAGCAAT GGC C CACAC TT GT AGT C C CAGCTAC C CAGGAGGCT GAGGT AGGAGGAT CACTGGGGC C C 
62063 AC^OTCCAAGTTTGCCATGAGCTGTC^TCATGCCACTGCAC 

62154 AGGT AAAAAAAAAAAACAAACACAGAGAT TTC TGGGT GC T AC C C C TCAGCAGTT AT GATTCATT AAGT C T GAGAT AGAT C C CAGAAATC T G 
62245 CATTTT GAAAAGC T C CACAGGTGAT C C C GAT ATGC CAC C CAGTTT GAAAAC GTTCTT AAAjTT ATTC GAAAAAT C GTAAGT AATTT CATT G 
62 33 6 TT C T CAGATT TC T AAGCAC TT CAAAGTCATTT ATTTCTC C CACAC T GATATTTT CATC TCAGATGTGGTGAAGC TGTAGAGAAAAACAAGC 

62 4 27 QTCTCATCACGGCAGACCAGAGC^CAA^ 

62518 GAAAATT C CAGAAC T GT ACAAGC CAAT AT T CAGAGTT GAGAGT CAAAACAGGT AACAACAGGGCACAGGAGGC CTC TT C C T GT GGGAT AAA 
6 2 609 GAGCAGCGCATGGGGCCTAGCACCTTGGGG'CAT 

627 0 0 CAGAGAGGCTACCAGAGTGTGATTCATTCTGCCTCTGTCCTCCCCATCCCTGCTC 
627 91 GCCAGGCTATTCCCAGGGCTGAATGATGGCCTGTGTTGGTTTTTTTGT^ 
62 8 82 GGTCGCAGACCTTTCACTTATTATTTGCTGAGTTGTCCATGACT^ 
62 973 CTAACTGATCTTTTCTTGCTTCTGTACGCTCTCT 
630 64 CAGGAAATCTGAAACCCAGTTGTCACAGGGC^ 

63155 TGCTGCTACTTAAAAAATGGGACATTTGCCACCCAGCACTGACTGTA 

632 4 6 AAC CAT GGAATT ATT C C CAAATGGAC TC T GAC CAGATTTTT GC CAT AC TGGGGGGT GGC GGGAT GGAGGATGGGT ACT CAGGCATGACT GC 




63701 TGTTTTATTATTACTtTITACATTAAT^

63792 AGT CAGC CAGAAATCACAGAT AC T GC TTTCAC IT AAATCGAAACAATT CTC C GATAAT GCTTTGC TTTTTTTC TTATGT CACTC TTGTGT A 

638 83 CTATCTATTTTTCTCCTCTCTGGGA-CCAAGTTTCTTTT^ 

63974 cATATCTATATCAGCmCAAAATATATTCAAC 

64 065 TT AAATT AT ATATTTTT AAT ATGAC T GT GAC C TTGACTGAT AATAAAGAT GT AATAAGAATT GCAAGC T AAAT GTTTC C C TTT GCAACT CA 

6415 6 T GC TTTGT CTTiT G'i. TTT GATGAC C T AC T C GC TC GT AAT GTTTTGT AAGGCAC TTCAGAGAGAAGACAGATGCAT CAT C C T GGC CTC CAT C 

64 2 47 AAAT AACAC T ATC CAAGGTGGCAC C T CTT C T GCAAT GTTTAAC C C T GC TAGT AATGAAC GAT GAC TT AGTTC GGAT ATTT CAGAAC TTTTT 

64 338 GTTTATACCATCAGGTAT GCATGAATTT AT AATC T GAAAGAGGAC TTAAAATAATAATTAAAACTTACCAGC TT AAGTGC T AAAC TTTTTA 

64 4 2 9 TTTTTTAGGTATTTGGGGAAGAAC TCTTTTTAAAGTATACAC CTAACTGC TTTTTAAAATGAGTACACAT GACAT ACTTT AATT C CATATG 

64520 TATTCCCCTACTCTTIX^SGAGACACTCTCTTGACACCA 

64 611 ATCATATACCTGTGTAGTAGC CACAGTACAAAACAGACTAGAACACAGCC CATAQ C 

64 702 AACACCTATGGTATTAGATTCTC^CCTAAAACAATAAGAGTTAGATC4CTAAGTTA^ 

6 4 793 C C C T AGT AAC CT AGAAT ATT C CT GATT AAAT ATC C C C TGC TTTTAGAT AC C T GT TGT C CATTTGGGTTT GTTTTTT ACAGT CT C TTTTGT A 

64 8 8 4 C CACAGT GGATACATTT GC TT CAT GAGT GCAGGAAC CAT GTT CAC T GC TGCATT CTT AC C C C T AGC C C TC-CAACAAACACACAAAAGAT AC 

64 97 5 CCAATAAATATTTCTTGATTCACTAAATGA^ 

65066 CAT ATAT AAT TTT AAAAATT C TAGTAGC CATATT AAAAATAAT AAT AGGC CAAGTGCAGTGGC TCAT ACATGT AATAC CAGCAGTTTGGAA 

65157 GAC CAAGGT GGGCAGAT CAC TTGAGC C CAGGAGTTT GAGAC CAGC CTGGC<LAACATGGCTAAAC C CCATCTCTAC CAAAAAAGATAT AAAA 

6524 8 AATT AAC CAAGT GT GGT GGCATGT GC CT GT AGTC C CAGC T AC T C GGGAGGC T AAGGT GGGAGGAT C GC TT GAGC C CAGAAGGTT GAGGC TG 

65 33 9 CAGT GAGC CATGAT C GT GT CACT GCACT C T AGC C T GGGT GACAGAGT GAGAC C C TGT C T CAAAAAAT AAT CAGCAT CAT AAAAAGAAAC CA 
654 30 GCAAAATT AACTTT ACT AGT AT ATTT AAC C CAAT AT ATAT AAAAT ATT ATT T CAAT AT GCTT C CACTAT AAAAAATTATT TTACAGT CTTT 
65521 T ATTTC CAT ATT AAGTC TTT AAAAAT CT GAT GTGT AGTTT GT ACTT ACAGCAC GTT GCAGTT AGGAC T GGC CACATTTT AAGT GCACAGT A 
65612 GC CACAGGGGGC CAC TGGC TAC CAT ATT GGAT AGT GC CATT C T AGAAGCTTTCAGC TTTTT CAAC TGGAT GC C TC T GATTT GTGGAC TCAG 
65703 AAT ACAGAT AAC CAAAGAAGT GGGAC TAGT GT CT GAAGT AAGAAT GACAGGGT ATGATT GAGAGC C C CAT GAGC TTAC CTAGGAGAGAAAC 
65794 TT GT GGGGTT GCAGAAT AAGGATTTGTCAAT ATT GGC TC T AGC TGTT CACAC T ATTT C T GGGC CAAC TC C CAGATCATTTC TCAAC T C CAG 
658 8 5 AT AGTT AAGT GGGGAGCAT GGCT GCACTTTTT AAAGT GAT GGCACAAAAAAAGAT ATT GAAC GTT GGT C C TC T GATTATAT ATT C T AAAT A 
65 97 6 T GCAGTT AGAAAAGAGGC C T TTT AAGAAT C C C TAAC^GT AAAGCAAATT AGT AT C TT T GTTT C C T GAAAATT AGAGAAAC TTGAT AT GC CA 
660 67 TGATAGCCCTCTTCATTTTATTTGGAAAACTCTTCTATGAAA 

6615 8 CAAGGT CAAGGGT GCAGTT GT CAC T ATCACAT AAGAATC T CAT AAAAATT AAACAT GAAT AT AC T GCACAGATC T GATT GGGTTT GT CAT G 

66249 C CACACATT GTTTT AAATT C CAT AATTC T ATT CT AT AAAGAGT GGTTT CT AT GACAAT AGAT C GTTTT AAAAACAAACAAACAAACAAAAT 

663 4 0 TT AGAGTT GT CATT GGT AATT GT GGT TGCAAGT AT GC TT T CAAAGAC CAGAAGC TTTT GTTTT GC TTT GAAT GT AATTTTTTT C TTTTTTC 

664 31 TTTTTGATAC GGAGTCT CAC TCTGTTGCC CAGGC TGGAGT GCATTGGCAC CAT C TCAGC TCAC TGCAAC CTC CAC CTCCGT GGTTC^U\GCA 
66522 ATTCTC CTCXT CTCAGCCTC C TGACT AGCTGG<^TT ACAGGC GTCCAC CAC CAC GC CTGACT AATTTTTGTATTTTT AGTAGAGATGGGGTT 
66613 TTACCATGTTOXCAAGCTGGTCTCAAATTCCTC^CCTCAGGTCA 
Putative promoter sequence of human CLASP-5



GGAACAATTTCCTCTCATGTGTATGGCTCCCTAAAGTGTTGGCTGAGCATTGTCCACATGGGTG 
ATGCAAAGGATCACTGAACTAGGAGCAGTTGGGAAAAAATACAATCATTGGGAATTCCTGTAGC 
ATCGAATGTGCCTACAGGGAGGTAGAAGTATTCATACAACAGTTCTCTGGTGTTCTCTGTTGTA 
GCAACCAGTCAGCCAAAAGGGTTCAGCTGCTTGAAATGAGAATGGCTGGATCAAAATGGCAGCT 
CATGATTTAAAGGATTCTAGTCAGATACCAGACATCCTCACATAGAGAAAACTCTGAATGGCTG 
GGGGAGAAGGAGTCAAATGCCCTGGATCTTTTTCTTGGGCCTCAAAGTCCTCCTTCTGTCATCA 
TCCTTCCAGTATTGGGCAGGACCTGACTGCAGGCATCATGGCCTCTGTGAACTTCTCAAGGGTA 
TGTATTATCTGACAAAAACTACGATGTCCACTAACAGGCCACTGAAAGGTATCTTAGTCAGTTC 
TGCTCATTGCCCAGCCAAGGCCTACGTTTTATAACATGATATCAAAGATTGCATCTAAAATTGT 
GATGATTTCCTAAAATAATCATTTCATTTAGATTTTTCTATTTTAATCCAAGGTATTCTTCAGC 
GGAAATAAGGAAACAGTTTACTCTCCCACCAAACCTTGGCCAGTACCATCGACAGAGCATAAGT 
ACCTCTGGCTTCCCCTCTCTTCAACTAGTAAGTATGAGTTCCAGGTTTACTTAGCGATTGGTCA 
AGTGCAAAAGTGCCCAGGGTATGTGTTTGCCTCCTGTTCCTTAGATCTTCCTACCATCACCTCA 
CATTCTCCAGTCACCAGATCCTAACTCTGTGACTGTGTCTGGACATCAGACAATATCCCTCTCJT 
CTCTCTGCCAACCGGTACTTAGGGTACATAATAGAACCTCTGGGAGCTGTGGTTTTGATGTCTC 
TAGACTAGGTGGGCTTCCAGGTGACTCAGTCTCATCCAAATTATGGTTCATATTTGGGGGAGAA 
GGGCTAGCCCAAAAACTTACCACCATTTGTAGTATGCATTTTTTTGGAAAAGCATATTCCAAAA 
TCTGAAATGCCAAGTTACAGACCTCCTTTTTGTAAAATAATTTTCTTGCTAGTATAATTTACAT 
ATAATAAAATTCACACATTTTAGGTGTACAATTTGGTGAACTTGGGCAACTTAGAGTCACTTAA 
CCTTTCCTCAGTCAAGATATAGAACACTTCTTTTATCCTAAAGCGTTCCCCAGCGCGCTTTTAC
AATCTCCTCTCCCCAGGCCACACCCTCCAACTCACGCAATCTCTGACTCACTTCTGTCACCATA 
ATTTTGCTCTATCTGGAGCTTCATATCCTGTTACAGTATGTACAAACCTTCTTTTTTTGAGACA 
GGGTGTCAGTCTGTCACCCAGCCTGGAGTACAGAGGTGTGATCTCAGCTCACTGCAACCTCAAC 
CTCCCAGGATCAGATGATTCTCCTCCCACCTCATCCTCCCAAGTAGCCGGGACTACAGGCGCAT 
GCCACCACACCTGGCTAATTTTTGTACTTTTTGTAGAGACAGGGGTCTCGCTATGTTGCCCAGG 
CTGGTCTTGAACTCCTGGGCTCAAGCGATCCTCCTGCCTCAGCCTCCCAAAGTGCTGGGATTAC 
AGTGAGCCACTGCACCTGGCCCTAAACCTTCATTTTTAAAACACATTTCCTCTTAAATTGAAGA 
TTGCCTACATTTTTATATCAATGCCAATTGTTGAGTGTGCCTATATGTGTTATATTATTTGAGC 
ACTAAATGCCAGATGTGTGCCAAGTGAGATAAATCTGACAAATGAGATGGTTTGTAAAACCAGC
AGTGAATATTCACTTCCTCTGTGAGAGAGCTCCAGCCCTCCTGTACTCACTTCCTCACACAGCA 
CAGCAGCACTCTTGCTGGTTCTGCTGCTTATCTTGAAGAGGTTAGGTTACTTTTTGTTTCTACT 
TATTACTTCGAAACCACTTCTGCCTTAGAAATTTTGTAACCTTCCGCTCAGTTTCCGGTAACCG 
CCATTTTGTCTCCTGTAACAATTTACGCGCCGTGTAACTGTGAATCTTT 
hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



-M FPME D I S I S V I GRQRRTVQ- 
-MTHLNSLDVQLAQELG 



20 

16 

KAERRAFAQKI SRTVAAEVRKQI SGQYSGSPOLLKNLNI VG 41 

MLLFPY DD FQTAI LRRQGRY I CS 23 

MAASERKAFAHKINRTVAAEVRKQVSR£RSGSPHSSRRCSSSL 4 3 

MSFRGKVFKREPSEFWKKRRTVRRVIQEEF^ 60 



hCLASP4 STVTEDAEKRAQSLFVKECIKTYSTDWHVVNYK 53 

hCLASP5 DFT 19

hCLASP3 N ISHHTTVPLTEATOPTOLEDYLITHPLAVDSGPLRDLIEFP 83

hCLASP2 TVPAKAEEEAQSLFVTEC IKTYNSDWHLVNYK 55 

hCLASP7 G VPLTEVVEPLDFEDVLLSRPPDAEPGPLRDLVEFP 7 9 



hCLASP1 DPLQDLLFFPSDDFSAATVSWDIRTLYSTVPEDAEHKAENLLVKEACKFYSSQWHVVNYK 120



hCLASP4
hCLASP5
hCLASP3

hCLASP2
hCLASP7
hCLASP1



YEDFSGDFRMLPCKSLRPEKIPNHVFEIDEDCEKDED SSSLCSQKGGVIKQG 105 

DDDLDWFTPKECRTLQP-SLPEEGVELDPHVR DCVQTYIREWLI 63 

PDDIEWYSPRDCRTLVS-AVPEE-SEMDPHVR DCIRSYTEDWAI 126 

YKDYSGE FROLPNKVVKLDKLPVHVYEVDEEVDKDED AASLGSQKGGITKHG 107 

ADDLELLLQPRECRTTEP-GI PKD-EKXDAQVK AAVEMY I EDWV I 122 

YEQYSGDIRQLPRAEYKPEKLPSHSFEIDHEDADKDEDTTSHSSSKGGGGAGGTGVFKSG 18 0 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



wlhkanvnstit--vthikvfkrryfyltqlpdgsyilnsykdeknskesk-gciyldaci 162 

vnrknqgspeic — gfkktgsrkdfhkt-lpkqtfesetlecsepaaqa — gprhlnvlc 118 

vi rkyhklgtgf — npntldkqkerqkg- lpkqvfes deapdgns yqddqddlkrrsms i 183 

wlykgnmnsais — vtmrsfkrrffhliqlgdgsynlnfykdekiskepk-gsi fldscm 164 

vhrryqylsaay--spvttdtqrerqkg-lprqvfeqdasgdersgpedsndsrjigsgsp 17 9 

wlykgn™stvnntvtvrsfi<kryfqltqlpdnsyimnfykdekiskepk-gcifldsct 23 9 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



DWC?CPKMFJ^HAF^LKMLDKYSHYLiAAETEQE>IEEWLITLKKI IQINTDSLVQEKKETVE 222 

DVSGKGPVTACDFDLRSLQPDKRLENLLQQVSAEDFEKQNEEARRTN RQAE 169 

DDTPRGSWACSIFDLKNSLPDALLPNLLDRTPNEEIDRQNDDQRKSN RHKE 23 4 

GWQNNKVRRFAFELKNQDKSSYLLAADSE^MEEWIT I LKKILQLN FEAAMQEK 219 

EDTPRSSGASS I FDLRN1JVADSLLPSLLERAAPEDVT)RKNETLRRQH RPPA 2 30 

GWQNNRiPJCYAFELKMNDLTYFVl^AAETESDMDEWIHTLNRILQISPEGPLQGRRSTEL 2 99 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



TAQDDETS S QGKAEN IMASLERSMHPELMKYGRETEQLNKLSRGDGRQNLFS FDSE 278 

LFALYPSVD EE DAVE I RPVPECPKEHLG N RILVKLLTLKFEIE 212 

LFALHPSPD EEEPIERLSVPDI PKEHFG QRLLVKCLS LKFE I E 277 

RNGDSHEDD EQSKLEGSGSGLDSYLPELAKSAREAE IK LKSESRVKLFYLDPD 2 72 

LLTLYPAPD EDEAVERCSRPEPPREHFG QRI LVKCLSLKFEIE 273 

TDLGLDSLDNSVTCECTPEETDSSENNLHADFAKYLTETEDTVKTTRNMERLNLFSLDPD 359 



hCLASP4 
hCLASP5
hCLASP3 
hCLASP2 



VQRLDFS GIEPDIKP- FEEKCNKRFLVNCHDLTFNI LGOIGDNAKGPPTNVEPFFI 3 33 

IEPLFAS IALYDVKERJQ<ISENFHCDLNSDQFXGFXRAHTPSVAASSQARSAVFSV 2 68 

IEPIFAS LALYDVKEKKKISE^FTFT)I^SEC^GLLRPHVPPAAITT1ARSAIFSI 333 

AQKLDFS SAEPEVKS-FXEKFX;K^IL\T<CNDLSF>JLCX:CVAENEEGPTTN^PFFV 327 

TPTFni LALYDVFFKKKT SFNFYFTT N c PSMKGT.T.PAHGTHPAI STTJVRS AT FSV ? 7 9 



FIG. 5



* 



hCLASP4 

hCLASP5 
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



NIJVLFDVKNNCKISADFHVDLNPPSVREMLWGSSTQLASDGSP KGSSPESY IHGIAE 390 

TYPSSDI YLWKIEKVLQQGD 1 GDCAEPYTVIKESDG GKSKE-KIEKLKL 317 

TYFSQDVFLVIKLEKVLQQGD IGECAEPYMI FKEADA TKNKE-KLEKLKS 382 

TLSLFDIKYNRKISADrHVDI^FSVR(>CLATTSPALMNGS GQSPSVLKGILHE 381 

TYPSPDI FLVIKLEKVLQQGD ISECCEPYMVLKEVDT AKNKE-KLEKLRL 378 

SVALYDLRDSRKISAJDFHVDLNHAAVRQMLLGASVALENGNIDTITPRQSEEPHIKGLPE 479 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



SQLRYIQQGI FSVTNPHPEI F1jV7ARIEKVLQGNITHCAEPYIKNSDPVKTAQKVHRTAKQ 4 50 

QAE S FCQR LGKYRMP FAWAP ISLSS FFNVS TLEREVT DVDSWGRS PVGERRTLA 372 

QADQFCQR LGKYRMPFAWTAIHLMNIVSSAGSLERDSTEVEISTGERKGSWSERR 437 

AAMQYPKQGIFSWCPHPDIFLVARIEKVLQGSITHCAEPYMKSSDSSKVAQKVIJ<NAKQ 4 41 

AAEQFCTR LGRYRMP FAWTAVHLAN I VS S AGQLDRDS D SEGERRPAWTDRR 4 29 

EWLKFPKQAVFSVSNPHSEIVLVAKIEKVl^GNIASGAEPYIKNPDSNKYAQKILKSNRQ 539 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



VCSRLGQYT^PFAWAARPIFKDTQGSLDLDGPFSPLYKQDSSKLSSEDIIJa*LSEYKKPE 



QSRRLSERALSLEENGVGSNFKTS 
NSSIVGRRSLERTTSGDDACNLTSFR- 



• — TLSjVS S FFKQEGDRLSDEDLFKFLADYKRS S 
■ PAT LI VTNFFKQEGDRLSDEDLYKFLADMRRPS 



AC QRLGQY RM P FAW AART L FKDASGNLDKNAF FS AI YRQDSNKLSNDEMLKLLADFRKPE 
RRGPQ — DRASSGDDACSFSGFR-PATLT}\miFFT<OEAERLSDEDLFKFLAI»iRRPS 

fcsklgkyrrafawavrsvfkdnqgnvdrdsfIfsplfrqesskistedlvklvsdyrrad 



510 
427 
496 
501 
483 
599 



— KTKLQI IPGQLNITVECVPVDLSNCITSSYVPLKPFE-KNCQNITVEVEEFVPEMTKY 567 
SLQRRVKSIPGLLRLEISTAPEIINCCLTPEMLPVKPFP-ENRTRPHKEILEFP — TREV 484 
SVLRRLRPITAQLKIDISPAPENPHYCLTPELLQVKLYP-DSRVRPTREILEFP — ARDV 553 
K-MAKLPVILGNLDI T IDhTVSSDFPhTYVNSSYI PTKQFETCSKTPITFEVEEFVPCI PKH 560 
SLLRRLRPVTAQLKIDISPAPENPHFCLSPELLHIKPYP-DPRGRPTKEILEFP — AREV 540 
R- 1 SKMQT I PGSLDI AVDNVPLEHPNCVTSSFI PVKPFNMMAQTEPTVEVEEFVYDSTKY 658 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4 
hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 



CYPFTIYKNHLYVYPLQLKYDSQKTFAKARNIAVCVEFRDSDESDASALKCIYGKP fVGSV 
YV PHTVYRNLLYVYPQRLNFVN — KLASARNI T IKI QFMCG-EDASNAMPVI FGKS 5GPE 
YVPNTTYRNLLYI YPQSLNFAN — RQGSARNI TVKVQFMYG-EDPSNAMPVIFGKS 5CSE 
TCPYTIYTNHLYVYPKYLKYDSQKSFAKARNIAICIEFKDSDEEDSQPLKCIYGRP 3GPV 
YAPHTSYRNLLYVYPHSLNFSS — RQGSVRNLAVRVQYMTG-EDPSQALPVI FGKS 5CSE 
CPPYRVYKNQIYIYPKHLKYDSQKCFNKARNITVCIEFKNSDEESAKPLKCIYGKF EGPL 



FT^ AYAWS HHNQNPE FYDE IKI ELP I HLHQKHHLL FT FYHVSCI INTKGTTKKQD1 VE 

FLQE VYTAVTYHNKSPDFYEEVKIKLPAKLTVNHHLLFTFYHISCC 0 KQGA£ VE 

FS KI AYTAWTHNRS P D FHE E I KVKL PAT LT DHHHLL FT FYKVS CC Q KQNTI LE 

FTR< AFAAVLKHHQNPE FYDE IKI ELPTQLHEKHHLLLTFFHVSCI NSSKGSTKKREft VE 

FTRi AFTPWYHNKSPEFYEEFKLHLPACVTENHHLLFTFYHVSC< P RPGT; iE 

FTS^ AYTAVLKHSQNPDFSDE\TCIELPTQLHEKHHILFSFYHVTCI INAKANAKKKEJ I*E 

* . .**.*..*.*..*< * ; » 



627 
541 

610 
620 
597 
718 



687 
595 
664 
680 
651 
778 



LLNERLQTGSYCLPVft LEKLPPNYSMHSAEKVPLQNPF IKWAEGHKGVFN 



T P VG FAWVP LLKDGR I I 
TLLGYSWLPI 
TPVGYTWI 

TQVGYSWLPLLKDGRWTSEQHI 
TPVGFTWI 

-cv^YAWT PT.MXimOTAFOFYNTPT 



TFEQQLPVS *NLPPGYT^TLNDAESRRQCNVi: IKWVDGAKPLLK 



747 

655 

PMLQNGRlJ^TGQFCLPVSjLEKPPQAYSVLSPEVP LP<i*KWVDNHKGVFN 721 

PVS ?\NLPSGHLGYQELGMGRHYGPE IKWVDGGKPLLK 74 0 
PLLQHGRLRTGPFCLPVS VDQPPPSYSVLTPDVA LPGJ4RWVDGHKGVTS 708 



ATST.PPNYT.FFQP^AFGKHGGStlKWVDGGKPLFK 9 38 



FIG. 5

2 of 6



hCLASP4 
hCLASP5

hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



FKS H LE S T I Y TQDLHVHK F FHHC QL I QS - 

I EVQAVSS VHTQDNHLEKFFTLCHS LES JVTFPI RVLDQKI SEMALEHELKLS I 

vevvavss ihtqdpyldkffalvnalde h 
i sthlvstvytqdqhlhnffqycqktes - 
veltavssvhpqdpyldkfftlvhvx.ee 3 
vstfvvstvntqdphvnaffqecqkrekd- 



GSKEVPGELIKYLKCLHAM 794 
ICLNSS 715 

LFPVKIGDMRIMENNLENELKSSISALNSS 780 
GAQALGNELVKYLKSLHAM 787 
-AFPFRLKDTVLSEGNVEQELRASLAALRLA 7 67 
MSQSPTSNFIRSCKNLLNVE 887 



-VLTHMTH- 



EDDVP 82 4 
775 

FEAMAS I INRLHKNLEGNHDQHG 84 0 

VLT-RAT QEEVA 816 

RPPIISG^VNLGRGAFEAMAHVVSLVHRSLEAAQDARG 827 
VLVQNE EDEIT 916 



E I QVM I QFLPVI LMQLI R 

R1EPLVLFLHLVLDKLF QLSVQPWVIAGQTANFSQFAFXS WAIANSLHNSKDLSKDQHG 
QL EPWRFLHLLLDKLI LLVIRPPVIAGQI VNLGQAS 

EG HVMIAFLPTILNQLF R 

S P EPLVAFSHHVLDKLV RLVI 
KIHAIMSFLPI ILNQLF K--- 



I NCTMV-LLH I VSKCHEEGLDS YLRSFIKYS FRPEKP 8 60 

RNCLLASYVHYVFRLPEVQRDVPKSGAPTALLDPRSYHTYGRTSAAAVSSKLLQARVMSS 835 
PJ*SLI^YIHYVFTU,PNTYPNSSSPG-PGGU^SVHYATMARSAVRP 899 

VNVTRV- 1 IHWAQCHEEGLES HLRSYVKYA YKAEPY 852 

HCPQLAAYVHYAFRLPGTEPSLPDGAPP VTVC2AATLARGSGRPASLYTARSKSISS 883 

TTVTRV-LPDIVAKCHEEQLDH SVQSYIKFV FKTRAC 952 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



SAPQAQLIH ETLATTMIAILKQS 

S N P D LAG T HS AADEE VKN IMS S K I ADRNC S RMS Y Y C S G S S DAP S S PA- 



883 
882 



SNPDISGTPTSPDDEVRSI IGSKGLDRSNSWVNTGGPKAAPWGSNPSPSAESTQAMDRSC 959 

875 

915 

972 



VASEYKTVH EELTKSMTTILKPS 

SNPDLAVAPGSVDDEVSRILASKLLHEELA-LQ- 
KE RPVH EDLAKNVTGLLKSN 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



ADFLSINKLLKYS WFFFEI IAKSM 907 

APRPASKKHFHEELALQ MWS TGMVKSM 910 

hTRMSSHTETSSFLQTLTGRLPTKKLFHEELALQVTVVCSGSVKESALQOAWFFra 1019 

ADFLTSNKLLRYS WFFFDVLIKSM 899 

WWSSSAVREAILQHA WFFFQLMVKSM 94 2 

DSPTVKHVLKHS WFFFAI ILKSM 995 

* ..:*** 

Cadherin Cleavage 



ATYLLEENKIKLERGQR FPETYHHVLHSLLLAI IPHVTIRYAEIPDE SRNVNY 

AQHVHNMDKRDS E RRTR r 5DRFMDDITT IVNWTSE I AALLVKPQKENEQAEKMNlJsLAF 
VHHLYFNDKLEAF RKSR FTERFMDDIAALVSTIASDIVSRFQKDTEM- 

AQHLIENSKVKL1RNQR rPASYHHAAETWNMLMPHI TQKFGDNPEA 

ALHLLLGQRLDTF RKLR f PGRFLDDITALVGSVGLEVI TRVHKDVEL 
AQHL I DTKKIQLE RPQR ~PESYQNELDNLVMVLSDHVI WKYKDALEE 



-VERLN1 SLAF 
-SKNAN1 SLAV 



SLAS 964 
970 
1076 
956 

AEHLNMSLAF 999 
TRRATtSVAR 1052 



FLKRCLTLMDRGFI F ML 
FLYDLLSLMDRGFVF WL 



K VLAE YK FE FLQT I CNHE HY I PLNL 
E TLISMKLEFLRILCSHEHYLNLNL 
FLNDLLSVMDRGFVFjSLIKSCYKQVSSKLYSLPNPE VLVSLRLDFLRI ICSHEHYVTLNL 

F TLFEYKFEFLRWCNHEHYIPLNL 



INDYISGFSPKDP— 
IRHYCSQLSAKLSNL 1 



FIKRCFT FMDRGFVF KQINNYISCFAPGDP 
jLSLVDRGFVF 5LVRAHYKQVATRLQ 
FLKRC FT FMDRGCVF KMVNNY I SM FS S GDL 



FLSDLLSLVDRGFVF 5LVRAHYKQVATRLQSSPNP7 1 ALLTLRMEFTRILCSHEHYVTLNL 



-PJTLCQYKFDFLQEVCQHEHFIPLCL 
. . . * . * * * + . . * * 



1019 
1027 
1136 
1011 
1059 
1107 



FIG. 5

3 of 6



hCLASP4 

hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



hCLASP4 
hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



Cadherin EC motif
PMAFAKPKLQR VQDS— NLEYSLSDEYCKHHFLVGdLLRdXSI 1060 

FFMNADTAPTSP — CPSI SSQNSSSCSSFQDQKIASMFDLTSEYRQQHFLTGI LFTE LAA 1085 
PCSLLTPPASPSPSVSSATSQSSGFSTWQDQKIA>WFELSVPFRQQHYLAGI VLTE LAV 1196 

PMPFGKGR I QR YQDL — QLDYS LTDE FCRNHFLVGI LLRE VGT 1052 

PCCPLSPPASPSPSVSSTTSQSSTFSSQAPDPKVTSMFELSGPFRQQHFXAGI LLTE LAL 1119 
PIRSANIPDPLTP SES TQELHASDMPEYSVTNEFCRKHFLIG1 LLREVGF 1157 



ALQDN 
ALDAEGEGI 
I 

ALQEFR - 

ALEPEAEGAFllLHKKAI 



--YE IRYTAISVIKNLLIKHAFDTRYQHKNQQAKIAQLYLPFVGLLLENI }RL 
Sfc VQRKAVSAIHSLLSSHDLDPRCVKPEVKVKIAALYLPLVGIILDAL P- 
LDPDAEGLFG LHKKVIN>T/HNLLSSHDSDPRYSDPQIKARVAMLYLPLIGIIMETV P — 
qVRiIAISVIJ<NLLIKHSFDDRYASRSHQARIATLYLPLFGLLIENV2RI 
SAVHSLLCGHDTDPRYAEATVKARVAELYXPLLS IARDTL P 
ALOE DQ rfVTUlIAIAVLKNLMAKHSFDDRYR£PRK 



1116 

— 1143 
1254 
1108 

— 1177 
1213 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



AGRDTLYSCA AMPN-S ASRDEFPCGFTSPANRGSLSTDKDTAYGS 1160 

q L CDFTVADTRRYRTSGSD 1162 

QLY DFTE THNQRGRP I C I ATDD — 127 6 

NVRX)VSPFPVNAGMTVKDESIALPA-\TOPLVTPQKGSTLDNSL 1167 

RLH DFAEGPGQRSRLASMLDSDTE 1201 

YLKDLYPFTVNTSNQGSRDDLSTNGGFQSQTAIKHANSVDTSFSKDVLNSIAAPSSIAIS 1273 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



FQ-NGHGIKREDSRGSLI PEGATGFPDQGNTGEN TRQSSTRS SVSQYNRLDQYE 1213 

EE QEGAGA I NQNVALA I AGNN FNLKT SGIVLSSLPYKQYNMLNADT 12 08 

YESESGSM I S QT VAMAI AGTSVPQLTR PGS FLLTSTSGRQHTT FS AES 132 4 

STPNINSVRNADSRGSLISTDSGNSLPERNSEKSNSLDIO^CK3SSTLGNSVVRCDKLDQSE 122 7 

GEGDIAGT INPSVAMAIAGGPLAPGSR AS ISQGPPTASRAGCALSAES 124 9 

TVNHADSRASLASLDSNPSTNEKSSEKTDNCEKIPRPLALIGSTLRFDRLDOAE 1327 



hCLASP4 

hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



] RSLLMCYLYIVKMI SEDTLLTYWNKVSPQELINILILLEVCLFHFRYMGKRNIAR 7HDA 127 3 
1 RKI>IICFXWIMKNAI)QSLIRJ<W1AI)LPSTQLNTIILDLLFICVLCFEYKGKQSSDK VSTQ 1268 
SRSLLICLLVT^KNADETVLQKWFTDLSVLQLNRLLDLLYLCVSCFXYKGKKVFEP^SL 1384 
] KSLIJ^CFLYILKSMSDDALFTYWNKASTSEI^IDFFTISEVCLHQFQYMGKRYIAR tfQEG 1287 
5 RTLLACVLWVLKNTEPALLQRWATDLTLPQLGRLLDLLYLCLAAFEYKGKKAFERINSL 1309 
1 RS LLMC FLH IMKT I S YETL I AYWQRAPS PEVS D FFS I LDVCLQN FRYLGKRN 1 1 R K I AA 1387 



hCLASP4 
hCLASP5 
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



-KSQTMPALRNRSGVMQARLQHLSSLESS 1311 

LEEALLRGEGARGEMMRRRAPGNDRFPGLNEN 1311 

LEEAI LGS I GARQEMVRRSRGQLERS PSGSAFGS Q 1430 

— 1323 

— 1350 



WLSKHFGIDR 

VLQKSRDVKAR 

TFKKSKDMRAK 

LGPIVHDRKS QTLPVSRNRT04MHARLQQLGSLDNS- 

TFKKSLDMKAR LEEAILGTIGARQEMVRRSRERSPFGNPEN 

A FK FVQS TQNNGTLKG S N P S C QT S G LLAQWMH S T S RHEGHKQHRS QTLP 1 1 RGKN 



1442 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



- FTLNHSSTTTEJ DI 



[ ILDMQENI IQASS 



[ fhqallegntatevsltvldtis 
--lrwkkeqthwrqanekldktkAeldqealisgnlatf^hli 
:nlrwrkdmthwrqntekldks rj e i e heal i dgnlateanl 1 1 ldtle i vt/qtvs 

ltfnhsyghsd;x)vlhqslleaniatevcltaldtlslf}tlafknqll 

- - vrwrks vt hwkqt s drvdkt ki eme heal vegnlat eas lwldt le 1 1 
--alsnpkllqmldntmtsnsne1 di vhhvdteant ategcltildlvslfttqthqrqlq 



FFfTQCFKTQLL 1359 
-ALD 1368 
VTE 1489 
1371 

✓QTVM-LSE 14 07 

1500 



4 of 6



hCLASP4 

hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



NNDGHNPLMKKVFD I HLATLKNGQS E VS LKHVFAS LRAF I S KFP S AFFKGRVNMCAAFCY 1419 

CKDS LLGGVLRVLVNSLNCDQSTTYLTHCFATLRALIAKFGDLLFEEEVEQCFDLCH 14 25 

SKES 1 LGGV1J<VLJ^SMACNQSAVYLQHCFATQRALVSKFPELLFEEETEQCADLCL 154 6 

ADHGHNPLMKKVFDVYLCFLQKHQSETALK>F/FTAIJISLI 14 31 

ARES VLGAVLKVVLYSLGSAQSALFLQHGIATQRALVSKFPELLFEEDTELCADLCL 14 64 

(^DCQNSLMKRGFDTYMLFFQVNQSATAlJafVFASLRLFVCKFPSAFFQGPADLCGSFCY 1560 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



EVTjKCCTSKISSTRNEASALLYLLMRNNFEYTKRKTFLRTHLQI I IAVSQLIADVALSGG 14 7 9 
QVLHHCSSSMDVTRSQACATLYLLMR — FS FGATS N FARVKMQVTMS LAS LVGRAPD FNE 1483 
RLLRHCSSS IGTIRSHPSASLYLLMR — QNFEIGNNFARVKMQVFMSLSSLVGTSQNFNE 1604 
E ILKCCNSKLSS I RTEASQLLYFLMRNNFDYTGKKS FVRTHLQVI I S VS QL IADWG I GE 1491 
RLLRHCGSRISTIRTHASASLYLLMR — QNFE I GHN FARVKMQVTMS LSSLVGTTQNFSE 1522 
EVLKCCNHRSRSTQTEASALLYLFMRKNFE FNKQKS I VRSHLQL I KAVSQLIADAG- 1 GG 1619 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



SRFQESLFI INNFANSDRPMKATAFPAEVKDLTKRIRTVXMATAQMKEHEKDPEMLIDLfQ 1539 

EHLRRS LRT I LAYS EEDTAMQMT P FPTQVEELLCNLNS I L YDTVKMRE FQEDPEMLMDLM 154 3 

E FLRRS LKT I LTYAEEDLELRET T FPDQVQDLVFNLHM I LS DTVKMKEHQEDPEMLI DLM 1664 

TRFQQS LS I INN CAMS DRL I KHTS FS S DVKDL TKR I RTVLMAT AQMKE HEND P EMLVDLQ 1551 

S RFQHS LAI TNN FANGDKQMKNSN FPAEVKDLTKR I R T VIJ^T AQMKE HE KDPEMLVDLQ 167 9 



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



transmembrane 

YSIAKSYASTPELRKTVnjDSMAKIHVKNGljFSEAAMCYVHVAALVAEFL HRKK 

YRIAKSYQASPDLRLTWLQNMAEKHTKKKC YTEAAMCLVHAAALVAEYL SMLEDH 

YRIAKGYQTSPE-RLTWLQNMAGKHSERSh HAEAAQCLVHSAALVAEYL SMLEDR 

YSLAKSYASTPELRKTWLDSMARIHVKNGI LS EAAMC YVHVT AL VAE Y L TRKG 

YRIARGYQGSPDLRLTWLQNMAGKHAELGh HAEAAQCMVHAAALVAEYL ALLEDQ 

YSLANSYASTPELRRTWLESMAKIHARNGI LSEAAMCYIHIAALIAEYL KRKGYVJKVEKI 
# .* + * *+ * .*** * .* .+*.**.-* 



1592 
1598 
1718 
1604 
1637 
1739 



L FPNGC S AFKK I T PN I DEEGAMKE DAGMMD 1622 

SYL PVGS VS FQN I S SNVLEES WSE DTLS PDEDGV 163 3 

KYLPVGCVTFQN I SSNVLEESAVS DDWS PDEEG I 17 53 

V FRQGC T AFRV I T PN I DE EASMME DVGMQD 163 4 

RHLPVGCVSFQNI SSNVLEESAISDDILSPDEEGF 167 2 

CTASLLSEDTHPCDSNSLLTTPSGGSMFSMGWPAFLSITPNIKEEGAAKEDSGMHD 1795 



ITAM



hCLASP4 
hCLASP5 
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



hCLASP4 
hCLASP5 
hCLASP3 
hCLASP2 
hCLASP7 



VHYS EEVLLE LLEQCVDG LWKAERYE 1 1 S E I S KL I VP I YEKRRE FEKLTQV YRT L HG 167 9 

CAGQY FTESGLVGL1XQAAELFSTGGLYETVNEVYKLVI P I LEAHREFRKLTLT HSK1QR 1693 

CSGKYFTESGLVGLLEQAAASFSMAC3^YEAVNEVYKVLIPIHEAKRI^ 1813 

VHFNEDVI>IELLEQCADGLWKAERYELIADIYKLIIPIYEKRR 1677 

CSGKHFTELGLVGLLEQAAGYFTMGGLYEAWEVYKNLIPILEAHRDYKKLAAVHGKI QE 17 32 
TPYNENILVEQLYMCGEFLWKSERYELIADVNKPI IAVFEKQRDFKKLSDL YYDIHR 1852 



ITAM



DOCK motif 



DOCK motif 



riYTKl LEVMHTKKRLLGT FFRVAF YGQS FFEEEEGKEY I YKEE KLTGLSE I S LRLVKI YG 
AFDS 1 VNKDH- -KRMFGI YFR^ GF FG-SKFGDLC EQEF VYKEE AI TKLPE I SHRLEAi YG 
ATSK 1 VHQS TGWERMFG1 Y FR\ G F YG- 1 KFGDLE EQE E V YKE E AI TKLAE I SHRLEG I YG 
J E FFEDEEGKEY IYKEF KLTPLSE ISQRLLKI YS 

aWkimhqssgwervfgiyfr\jgfyg-ahfgdleeqee^ykeesitklaeishrleeiyt 



ITAM

1739 
1750 
1872 
1710 
1791 



5 of 6



hCLASP4 

hCLASP5
hCLASP3 
hCLASP2 
hCLASP7 
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



hCLASP4
hCLASP5
hCLASP3
hCLASP2
hCLASP7
hCLASP1



ITAM



ITAM 



EK FGTENVKI I QDSDKVNAKELDPJ YAHIQ\ 1 YVKi Y FDDKEL TERKT £ FERNHN I S R FV 17 99 



QC FGAEFVEVIKDSTPVDKTKLDP^ KAYIQ] TFVEI 



DKFGADNVKI I QDSNKVNPKDLDPt 



yayiq\ 



EPFGEDWEVIKDSNPVDKCKLDPh KAYIQ] TYVEI YFDTYEKKDRITYFDKNYNLRRFM 1932 
DKFGSENVKMIQDSGKVNPKDLDSfYAYIQ\TKVII FFDEKELQERKTEFERSHNIRRFM 177 0 
EBFGDDWEI IKDSYPVDKSKLDSC KAYIQ1 T YVEI YFDTYELKDRVTYFDRNYGLRTFL 1851 



TYVTI 



ITAM 



FEAPYTLSGKKQGCIEEQCKRRTILTTSNSFEtYVKHRIPINCEQQIh LKPIDGATDEIKD 

[SVIQKEEF\ LTPIEV7AIEDMKK 
YCT P FTLDGRAHGELHEQFKRKT I LTTS HAFFjY I K^RVNVTHKEE I ] LTP I EVAIEDMQK 

[ PVMYQHHTI LNPIEV7AI DEMSK 
FC T P FT PDGRAHGE LPEQHKRKT LLS TDHAFFjY I Ktf R I RVCHREE*n LT PVEVAI EDMQK 
FETPFTLSGKKHGGVAEQCKRRTILTTSHLFF YVKI RIQVISQSSTi LNPIEVAI DEMSR 



Coiled-coil 



YFDEYEMKDRVTYFEKNFNLRRFM 1810 



FFEEKEIEDRKTDFEMHHNINRFV 1972 



DOCK motif 



KTAELQKLCSSTDVDMIQLQLKLQC WVSVQVNAGPLAYARAFLNDSQASKYPPKKVSELK 
KT LQLAVAI NQEP PDAKMLOMVLQC S VGATVNQG PLEVAQVFLAEIPADPKLYRHHNKLR 
KTQELAFATHQDPADPKMLOJKVLQC SVGTTVNQGPLEVAQVFXSEIPSDPKLFRHHNKLR 
KVAELRQLCSSAEVDMIKLQLKLQt SVSVQVNAGPLAYARAFLDDTNTKRYPDNKVKLLK 
KTRELAFATEQDPPDAKMLQMVXQC SVGPTVNQGPLEVAQVFLAE I PEDPKLFRHHNKLR 
KVS ELNQLCTMEEVDM I S LQLKLQGS VS VKVNAG PMAYARAFLEETNAKKYPDNQVKLLK 



DMFRKFIQACS 
LC 



FKEFIMRCGI LAVEKNKRL 
LCFKDFTKRCEI 'ALRKNKSLI 



Coiled-coil 



IHEQILQEDTMHSP 



ALELNERLIKEDQVEYHEGLKSNFRDMVKELSDI ] 

ITADQREYQQELKKNYNKLKENLRPW IERKIPELYKPI FR 

GPVQKEYQRELGKLSSP 

EVFRQFVEACG<$ALAVNERL I KEDQLE YQEEMKANYREMAKELS E I MHEQ I C PLEEKTS - 
IGPDQKEYHRELERNYCRLREALQPLLTQRLPQLMAPTP- 
E I FRQ FADAC G(jAL DVNERL I KE DQLE YQEE LR S KYKDMLS E L S TV MNE Q I TG RDDL S KR 



PDZ ligand 
WMSNTLHVFCAISGTSSDRGYGSPFlXAEyj -- 2008 
VESQKRDS FHRSSFRKCETQLSQGS 2015 



1859 
1870 
1992 
1830 
1911 
2032 



1919 
1930 
2052 
1890 
1971 
2092 



1979 
1990 
2090 
1949 
2 030 
2152 



VLPNSLHI FNAISGTPTSTMVHGMTS jsSVVl 1980 

--PGLRNSLNRASFRKADL — 204 7 

GVDQTCTRVISKATPALPTVSIS gSAEV-t — 2180 



